Simulation workflow • erahumed

This document illustrates the simulation workflow of ERAHUMED. Specifically, we will cover:

How to set up and run a simulation.
How to extract and analyze simulation results.

This guide is addressed to users working with the command-line (i.e. R) interface of erahumed.

library(erahumed)

An ERAHUMED simulation can be run with:

sim <- erahumed_simulation()
sim
#> A ERAHUMED simulation.
#> 
#> Start date: 2020-01-01 
#> End date: 2020-12-31
#> 
#> Need help extracting simulation outputs? Check `?get_results`.

The command above runs a simulation with the default model parameters. These can be customized through the arguments of erahumed_simulation(), for instance:

sim2 <- erahumed_simulation(
  variety_prop = c(J.Sendra = 0.6, Bomba = 0.2, Clearfield = 0.2),
  ideal_flow_rate_cm = 10,
  seed = 841
)
#> Warning in generate_clusters_variety(variety_prop): Surface proportion
#> allocated to 'Clearfield' was too high. Reduced to 0.162698421372703

The full set of simulation parameters is documented in ?erahumed_simulation, and is also available in table format.

Since erahumed_simulation() both sets up and runs the simulation, no separate run step is needed. Simulation results are extracted with get_results(), as follows:

lake_hydrology_df <- get_results(sim, component = "hydrology", element = "lake")
cluster_hydrology_df <- get_results(sim, component = "hydrology", element = "cluster")
cluster_exposure_df <- get_results(sim, component = "exposure", element = "cluster")

These are provided in the form of data.frames, for instance:

head(cluster_hydrology_df)
#>   ideal_height_eod_cm ideal_irrigation ideal_draining petp_cm   area_m2
#> 1                  20             TRUE           TRUE  -0.058 114881.78
#> 2                  20             TRUE           TRUE  -0.058 116539.90
#> 3                  20             TRUE           TRUE  -0.058 154730.35
#> 4                  20             TRUE           TRUE  -0.058 163789.56
#> 5                  20             TRUE           TRUE  -0.058  83016.51
#> 6                  20             TRUE           TRUE  -0.058 106260.07
#>   capacity_m3_s       date                element_id ditch_element_id seed_day
#> 1    0.05741888 2020-01-01 02_Carrera_del_Saler0-2_0               d2     -110
#> 2    0.05741888 2020-01-01          03_Petxinar0-3_2               d2     -110
#> 3    0.05741888 2020-01-01          03_Petxinar0-3_3               d2     -110
#> 4    0.05741888 2020-01-01          03_Petxinar1-3_1               d2     -110
#> 5    0.05741888 2020-01-01          03_Petxinar1-3_2               d2     -110
#> 6    0.05741888 2020-01-01          03_Petxinar1-3_3               d2     -110
#>   tancat    variety height_sod_cm irrigation draining ideal_diff_flow_cm
#> 1   TRUE   J.Sendra            20       TRUE     TRUE              0.058
#> 2   TRUE   J.Sendra            20       TRUE     TRUE              0.058
#> 3   TRUE   J.Sendra            20       TRUE     TRUE              0.058
#> 4   TRUE Clearfield            20       TRUE     TRUE              0.058
#> 5   TRUE Clearfield            20       TRUE     TRUE              0.058
#> 6   TRUE      Bomba            20       TRUE     TRUE              0.058
#>   ideal_inflow_cm ideal_outflow_cm outflow_m3_s outflow_cm inflow_cm
#> 1               5            4.942            0          0     0.058
#> 2               5            4.942            0          0     0.058
#> 3               5            4.942            0          0     0.058
#> 4               5            4.942            0          0     0.058
#> 5               5            4.942            0          0     0.058
#> 6               5            4.942            0          0     0.058
#>    inflow_m3_s height_eod_cm plan_delay
#> 1 0.0007711971            20          0
#> 2 0.0007823281            20          0
#> 3 0.0010386991            20          0
#> 4 0.0010995133            20          0
#> 5 0.0005572868            20          0
#> 6 0.0007133199            20          0
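The output mixes per-area quantities (in cm) with volumetric flows (in m3/s), so it can help to check how the two relate before analyzing them. The sketch below recomputes inflow_m3_s from inflow_cm and area_m2 for the first three rows printed above. The conversion used (depth times area, spread over the 86400 seconds of a day) is an assumption inferred from the printed values, not taken from the package documentation.

```r
# Values copied from the first three rows printed above.
printed <- data.frame(
  area_m2     = c(114881.78, 116539.90, 154730.35),
  inflow_cm   = c(0.058, 0.058, 0.058),
  inflow_m3_s = c(0.0007711971, 0.0007823281, 0.0010386991)
)

seconds_per_day <- 24 * 60 * 60

# Assumed conversion: water depth (cm -> m) times area gives a daily volume,
# which is then averaged over the seconds in one day.
recomputed <- (printed$inflow_cm / 100) * printed$area_m2 / seconds_per_day

# Compare with the printed column, allowing for rounding in the printout.
all.equal(recomputed, printed$inflow_m3_s, tolerance = 1e-6)
```

If the comparison holds on your own results as well, the same formula can be used to move between the *_cm and *_m3_s columns.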

From here on, the analysis may proceed in whichever way you find most convenient. For instance, in the chunk below we plot the water levels of a set of clusters with similar features, using dplyr and ggplot2:

library(dplyr)
library(ggplot2)

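# Features used to select a group of comparable clusters
ditch <- "d4"
tancat <- FALSE
variety <- "Clearfield"

# Keep only the clusters matching all three features; the ditch is stored
# in the `ditch_element_id` column of the hydrology results
clusters_df <- cluster_hydrology_df |>
  filter(ditch_element_id == !!ditch, tancat == !!tancat, variety == !!variety)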

avg_df <- clusters_df |>
  group_by(date) |>
  summarise(height_eod_cm = mean(height_eod_cm))

ggplot() +
  geom_line(
    data = clusters_df,
    mapping = aes(x = date, y = height_eod_cm, group = element_id),
    color = "black", linewidth = 0.1, alpha = 0.2) +
  geom_line(
    data = avg_df, 
    mapping = aes(x = date, y = height_eod_cm),
    color = "black"
    ) +
  xlab("Date") + ylab("Height [cm]") + 
  ggtitle("Cluster simulated water levels",
          paste("Ditch:", ditch, "- Tancat:", tancat, "- Variety:", variety)
          )
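The filter() call above uses `!!` because the local variables tancat and variety share their names with columns of the data frame. The sketch below illustrates the difference on a small made-up data frame (toy_df and its values are hypothetical, chosen only for illustration); it assumes dplyr is installed.

```r
library(dplyr)

# Toy data frame with three of the columns shown earlier (values are made up).
toy_df <- data.frame(
  element_id = c("a", "b", "c", "d"),
  tancat     = c(TRUE, FALSE, FALSE, TRUE),
  variety    = c("Bomba", "Clearfield", "Bomba", "Clearfield")
)

tancat <- FALSE
variety <- "Clearfield"

# Without `!!`, both sides of `==` refer to the column, so each row is
# compared with itself and nothing is filtered out.
unfiltered <- filter(toy_df, tancat == tancat, variety == variety)
nrow(unfiltered)

# With `!!`, the right-hand side is replaced by the value of the local
# variable before the comparison is evaluated.
selected <- filter(toy_df, tancat == !!tancat, variety == !!variety)
as.character(selected$element_id)
```

Renaming the local variables (for example to sel_tancat and sel_variety) would avoid the name clash altogether, at the cost of slightly longer code.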

Further information

Further details will appear in this and possibly other vignettes. For specific problems, you can file an issue on GitHub.