This document illustrates the basic simulation workflow of ERAHUMED. Specifically, we will cover:

  • How to set up and run a simulation.
  • How to extract and analyze simulation results.

The goal is to provide a practical, step-by-step overview of how to perform simulations. For a detailed explanation of the underlying models, algorithms, and assumptions, refer to the user manual.

This guide is intended for users working with the R interface of the package and is not required for those using only the Shiny GUI.

Running a simulation

The main interface for running simulations in erahumed is the erahumed_simulation() function:

sim <- erahumed_simulation()
#> Initializing inputs
#> Computing hydrology: lake
#> Computing hydrology: clusters
#> Computing hydrology: ditches
#> Computing exposure: clusters
#> Computing exposure: ditches
#> Computing exposure: lake
#> Computing risk: clusters
#> Computing risk: ditches
#> Computing risk: lake
sim
#> <ERAHUMED Simulation>
#>   Date range             : 2020-01-01 to 2020-12-31 
#>   Simulation days        : 366 
#>   Clusters               : 552 
#>   Management systems     : 2 
#>   Chemicals simulated    : 8 
#>   Total applications     : 17
#> 
#> Need help extracting simulation outputs? Check `?get_results`.

This function handles both the setup and execution of a simulation. Simulation parameters are specified via its arguments, and calling the function launches the simulation and returns a fully executed object containing the results (note that this may take some time).

The example above runs a simulation with the default model parameters. These can be customized via the arguments of erahumed_simulation(). For instance:

sim2 <- erahumed_simulation(foc_ss = 0.20, foc_sed = 0.07)
#> Initializing inputs
#> Computing hydrology: lake
#> Computing hydrology: clusters
#> Computing hydrology: ditches
#> Computing exposure: clusters
#> Computing exposure: ditches
#> Computing exposure: lake
#> Computing risk: clusters
#> Computing risk: ditches
#> Computing risk: lake
sim2
#> <ERAHUMED Simulation>
#>   Date range             : 2020-01-01 to 2020-12-31 
#>   Simulation days        : 366 
#>   Clusters               : 552 
#>   Management systems     : 2 
#>   Chemicals simulated    : 8 
#>   Total applications     : 17

runs a simulation with modified environmental parameters (the fraction of organic content in suspended solids and in sediment), while:

sim3 <- erahumed_simulation(date_start = "2019-01-01", date_end = "2019-12-31")
#> Initializing inputs
#> Computing hydrology: lake
#> Computing hydrology: clusters
#> Computing hydrology: ditches
#> Computing exposure: clusters
#> Computing exposure: ditches
#> Computing exposure: lake
#> Computing risk: clusters
#> Computing risk: ditches
#> Computing risk: lake
sim3
#> <ERAHUMED Simulation>
#>   Date range             : 2019-01-01 to 2019-12-31 
#>   Simulation days        : 365 
#>   Clusters               : 552 
#>   Management systems     : 2 
#>   Chemicals simulated    : 8 
#>   Total applications     : 17

runs a simulation over a different date range.
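Since a run may take some time, it can be convenient to store a finished simulation on disk and reload it in later sessions. The following is a minimal sketch, not part of the package: `cached_run()` is a hypothetical helper built on base R's `saveRDS()` and `readRDS()`, and it assumes that the object returned by `erahumed_simulation()` survives serialization unchanged.

```r
# Hypothetical helper (not part of erahumed): run an expensive computation
# once and reload the stored result on subsequent calls.
cached_run <- function(path, run) {
  if (file.exists(path)) {
    return(readRDS(path))  # reuse the stored result
  }
  result <- run()          # the expensive step, e.g. a simulation
  saveRDS(result, path)
  result
}

# Intended usage (assumes the simulation object can be serialized):
# sim <- cached_run("sim_default.rds", function() erahumed::erahumed_simulation())
```

The cache is keyed only by the file path, so each set of simulation parameters needs its own file.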

The full set of simulation parameters is documented in the user manual, as well as in the R documentation page ?erahumed_simulation. We highlight a few special parameters:

  • Observational inputs - The outflows_df and weather_df arguments of erahumed_simulation() are data frames containing the time-series data that serve as the empirical basis for ERAHUMED simulations. Further details are provided in the user manual. The date_start and date_end arguments (see above) must fall within the time range covered by these datasets.

  • Rice-field (agrochemical) management system map - The rfms_map argument is the primary interface for advanced scenario customization, encapsulating the full set of user-defined agrochemical configurations (including custom chemicals and Rice-Field Management Systems, or RFMSs). A conceptual overview of these capabilities is provided in the user manual, while a step-by-step guide to creating custom scenarios is available in a dedicated vignette.

  • Random seed - The seed argument controls the random number generator used in the simulation, ensuring reproducible results when stochastic elements are involved (e.g., the random order in which rice field clusters are drained during the sowing season). Setting a fixed value allows you to obtain identical results when re-running the same simulation; leaving it unset may produce slightly different outcomes across runs.
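The constraint on date_start and date_end mentioned above can be checked before launching a long run. The sketch below is only an illustration: `dates_within_coverage()` is a hypothetical helper, not a package function, and the toy date vector stands in for the dates of a real observational input. It checks only the end points of the covered period, not gaps inside it.

```r
# Hypothetical helper (not part of erahumed): check that a simulation
# window lies inside the period covered by a vector of observation dates.
dates_within_coverage <- function(date_start, date_end, observed_dates) {
  date_start <- as.Date(date_start)
  date_end <- as.Date(date_end)
  observed_dates <- as.Date(observed_dates)
  date_start <= date_end &&
    date_start >= min(observed_dates) &&
    date_end <= max(observed_dates)
}

# Toy daily time series standing in for an observational input
toy_dates <- seq(as.Date("2019-01-01"), as.Date("2020-12-31"), by = "day")

ok_window <- dates_within_coverage("2019-01-01", "2019-12-31", toy_dates)
bad_window <- dates_within_coverage("2018-06-01", "2019-12-31", toy_dates)
```

Here `ok_window` lies inside the toy range, while `bad_window` starts before it.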

Analyzing simulation results

Simulation results are extracted as follows:

lake_hydrology_df <- get_results(sim, component = "hydrology", element = "lake")
cluster_hydrology_df <- get_results(sim, component = "hydrology", element = "cluster")
cluster_exposure_df <- get_results(sim, component = "exposure", element = "cluster")

The results are returned as data frames; for instance:

head(cluster_hydrology_df)
#>   ideal_height_eod_cm ideal_irrigation ideal_draining is_plan_delays_window
#> 1                  20             TRUE           TRUE                 FALSE
#> 2                  20             TRUE           TRUE                 FALSE
#> 3                  20             TRUE           TRUE                 FALSE
#> 4                  20             TRUE           TRUE                 FALSE
#> 5                  20             TRUE           TRUE                 FALSE
#> 6                  20             TRUE           TRUE                 FALSE
#>   petp_cm   area_m2 capacity_m3       date                element_id
#> 1  -0.058 114881.78    4960.991 2020-01-01 02_Carrera_del_Saler0-2_0
#> 2  -0.058 116539.90    4960.991 2020-01-01          03_Petxinar0-3_2
#> 3  -0.058 154730.35    4960.991 2020-01-01          03_Petxinar0-3_3
#> 4  -0.058 163789.56    4960.991 2020-01-01          03_Petxinar1-3_1
#> 5  -0.058  83016.51    4960.991 2020-01-01          03_Petxinar1-3_2
#> 6  -0.058 106260.07    4960.991 2020-01-01          03_Petxinar1-3_3
#>   ditch_element_id seed_day tancat rfms_id  rfms_name height_sod_cm irrigation
#> 1               d2     -110   TRUE       2 Clearfield            20       TRUE
#> 2               d2     -110   TRUE       2 Clearfield            20       TRUE
#> 3               d2     -110   TRUE       2 Clearfield            20       TRUE
#> 4               d2     -110   TRUE       2 Clearfield            20       TRUE
#> 5               d2     -110   TRUE       2 Clearfield            20       TRUE
#> 6               d2     -110   TRUE       2 Clearfield            20       TRUE
#>   draining ideal_diff_flow_cm ideal_inflow_cm ideal_outflow_cm outflow_m3
#> 1     TRUE              0.058               5            4.942          0
#> 2     TRUE              0.058               5            4.942          0
#> 3     TRUE              0.058               5            4.942          0
#> 4     TRUE              0.058               5            4.942          0
#> 5     TRUE              0.058               5            4.942          0
#> 6     TRUE              0.058               5            4.942          0
#>   outflow_cm inflow_cm inflow_m3 height_eod_cm plan_delay
#> 1          0     0.058  66.63143            20          0
#> 2          0     0.058  67.59314            20          0
#> 3          0     0.058  89.74360            20          0
#> 4          0     0.058  94.99795            20          0
#> 5          0     0.058  48.14958            20          0
#> 6          0     0.058  61.63084            20          0
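Judging from the printout, the cluster table appears to hold one row per cluster (element_id) and day (date). A few base R checks can confirm this kind of structure before any further analysis. The sketch below runs on a small made-up data frame that reuses some of the column names shown above; the values and the "OtherSystem" label are invented for illustration.

```r
# Toy stand-in with a few of the columns shown in the printout above
toy_df <- data.frame(
  date = rep(as.Date(c("2020-01-01", "2020-01-02")), each = 3),
  element_id = rep(c("c1", "c2", "c3"), times = 2),
  rfms_name = rep(c("Clearfield", "Clearfield", "OtherSystem"), times = 2),
  height_eod_cm = c(20, 20, 20, 19, 18, 20)
)

n_clusters <- length(unique(toy_df$element_id))  # distinct clusters
date_span <- range(toy_df$date)                  # first and last day

# Does the table hold exactly one row per cluster and day?
is_complete <- nrow(toy_df) == n_clusters * length(unique(toy_df$date))

# Number of clusters under each management system
cluster_info <- unique(toy_df[c("element_id", "rfms_name")])
clusters_per_rfms <- table(cluster_info$rfms_name)
```

The same expressions should apply to cluster_hydrology_df by substituting it for `toy_df`.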

From here on, the analysis can proceed in whatever way you find most convenient. For instance, the chunk below plots water levels for a set of clusters with similar features, using dplyr and ggplot2:

library(dplyr)
library(ggplot2)

ditch <- "d4"
tancat <- FALSE
rfms_name <- "Clearfield"  # Rice field management system

clusters_df <- cluster_hydrology_df |>
  filter(ditch_element_id == !!ditch, tancat == !!tancat, rfms_name == !!rfms_name)

avg_df <- clusters_df |>
  group_by(date) |>
  summarise(height_eod_cm = mean(height_eod_cm))

ggplot() +
  geom_line(
    data = clusters_df,
    mapping = aes(x = date, y = height_eod_cm, group = element_id),
    color = "black", linewidth = 0.1, alpha = 0.2) +
  geom_line(
    data = avg_df, 
    mapping = aes(x = date, y = height_eod_cm),
    color = "black"
    ) +
  xlab("Date") + ylab("Height [cm]") + 
  ggtitle("Cluster simulated water levels",
          paste("Ditch:", ditch, "- Tancat:", tancat, "- RFMS:", rfms_name)
          )
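The filter step above uses `!!` because the local variables tancat and rfms_name share their names with columns of the data frame; without it, dplyr would compare each column with itself. A minimal sketch on a toy data frame (made-up values) shows the difference, along with the `.env` pronoun, an alternative that dplyr also supports:

```r
library(dplyr)

toy_df <- data.frame(
  element_id = c("c1", "c2", "c3"),
  tancat = c(TRUE, FALSE, FALSE)
)
tancat <- FALSE  # local variable with the same name as a column

# Both sides refer to the column, so every row passes the filter
n_masked <- toy_df |> filter(tancat == tancat) |> nrow()

# `!!` injects the value of the local variable into the expression
n_injected <- toy_df |> filter(tancat == !!tancat) |> nrow()

# `.env$` explicitly looks the name up outside the data frame
n_env <- toy_df |> filter(tancat == .env$tancat) |> nrow()
```

Giving local variables names that differ from the column names avoids the ambiguity altogether.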

Further information

For additional details not covered in this guide, see the other package vignettes. If there is a specific topic you would like to see documented, please let us know by filing an issue on GitHub or by using the contact information provided on the package homepage.