This is a helper function for preparing PDA external objects. It doesn't cover all cases yet, so use it with care. You can use this function to generate just one of the external.* PDA objects, but note that some arguments cannot be left blank, depending on what you aim to generate.

pda.generate.externals(
  external.data = FALSE,
  obs = NULL,
  varn = NULL,
  varid = NULL,
  n_eff = NULL,
  align_method = "match_timestep",
  par = NULL,
  model_data_diag = FALSE,
  model.out = NULL,
  start_date = NULL,
  end_date = NULL,
  external.formats = FALSE,
  external.priors = FALSE,
  prior.list = NULL,
  external.knots = FALSE,
  knots.list = NULL,
  ind.list = NULL,
  nknots = NULL
)

Arguments

external.data

boolean, if TRUE the function will generate external.data for PDA. In that case you need to pass varn and obs too, as well as align_method if it differs from "match_timestep"

obs

your data as an ordered list, where each element is a data frame for one constraining variable with two columns: the variable name and posix.

IMPORTANT: your obs must already be in PEcAn standard units, this function doesn't do unit conversions!

IMPORTANT: your obs must be ready to compare with model outputs in general, e.g. if you're passing flux data it should already be ustar filtered.

e.g. obs[[1]]

             NEE               posix
    4.590273e-09 2017-01-01 00:00:00
              NA 2017-01-01 00:30:00
              NA 2017-01-01 01:00:00
              NA 2017-01-01 01:30:00
              NA 2017-01-01 02:00:00
    4.575248e-09 2017-01-01 02:30:00

If you have more than one variable, make sure the order in which you pass the data is the same as in varn. E.g. for varn = c("NEE", "Qle"), the data should be

obs[[1]]

    NEE      posix
     NA 2018-05-09
     NA 2018-05-10
     NA 2018-05-11
     NA 2018-05-12
    ...        ...

obs[[2]]

    Qle      posix
     NA 2018-05-09
     NA 2018-05-10
     NA 2018-05-11
     NA 2018-05-12
    ...        ...
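A minimal sketch of how such an obs list could be assembled in base R. The timestamps and values are made up for illustration, and the closing check is only a convenience added here, not something the function is known to perform.

```r
# Hypothetical half-hourly timestamps; replace with your own observation times.
posix <- seq(
  from = as.POSIXct("2017-01-01 00:00:00", tz = "UTC"),
  by = "30 min",
  length.out = 6
)

# Made-up values for illustration only; NA marks gaps (e.g. filtered records).
nee <- data.frame(NEE = c(4.59e-09, NA, NA, NA, NA, 4.58e-09), posix = posix)
qle <- data.frame(Qle = c(NA, 12.1, 15.4, NA, 20.3, 18.7), posix = posix)

# The order of the list elements must match the order of varn.
varn <- c("NEE", "Qle")
obs <- list(nee, qle)

# Convenience check (an assumption of this sketch): the first column of each
# data frame carries the same name as the corresponding entry of varn.
first_cols <- vapply(obs, function(d) names(d)[1], character(1))
stopifnot(identical(first_cols, varn))
```

Building each data frame separately and only then combining them keeps the ordering explicit, which matters because obs and varn are matched by position.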

varn

a vector of PEcAn standard variable name(s) to read from model outputs, e.g. c("NEE", "Qle")

varid

a vector of BETY variable id(s) of your constraints, e.g. for varn = c("NEE", "Qle"), varid = c(297, 298)

n_eff

effective sample size of the constraints. PDA functions estimate it for NEE and LE and use it in the heteroskedastic Laplacian only. If you already know it, passing it now will save you some time

align_method

one of the benchmark::align_data align_method options "match_timestep" or "mean_over_larger_timestep", defaults to "match_timestep"
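To give a rough idea of what averaging over the larger timestep means, the sketch below collapses half-hourly observations to daily means in base R. This is a conceptual illustration only: it assumes the option averages the finer series onto the coarser step, and it does not reproduce the actual align_data implementation. All values are made up.

```r
# Two days of hypothetical half-hourly data (48 records per day).
posix <- seq(as.POSIXct("2017-01-01 00:00:00", tz = "UTC"),
             by = "30 min", length.out = 96)
obs_fine <- data.frame(NEE = rep(c(1, 3), each = 48), posix = posix)

# Label each record with its calendar day, then average within each day.
obs_fine$day <- as.Date(obs_fine$posix, tz = "UTC")
obs_daily <- aggregate(NEE ~ day, data = obs_fine, FUN = mean)
print(obs_daily)
```

With "match_timestep", by contrast, no averaging of this kind would be expected; the name suggests that records are paired at coinciding timesteps.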

par

list with vector sublists of likelihood parameters of the heteroskedastic Laplacian for flux data. If NULL, the function calculates it for NEE, FC, and Qle. Leave empty for other variables. e.g.

    AMF.params <- PEcAn.uncertainty::flux.uncertainty(...fill in...)
    par <- list(c(AMF.params$intercept, AMF.params$slopeP, AMF.params$slopeN))

model_data_diag

optional, for diagnostics. If you want to check whether your model and data will be aligned properly in PDA, set this to return a data frame and plot a quick and dirty time series graph

model.out

an example model output folder to align your data with model, e.g. "/data/workflows/PEcAn_15000000111/out/15000186876"

start_date

the start date of the model.out run, e.g. "2017-01-01"

end_date

the end date of the model.out run, e.g. "2018-12-31"

external.formats

boolean, if TRUE make sure to pass the varn argument

external.priors

boolean, if TRUE pass prior.list argument too

prior.list

a list of prior data frames (one per pft, make sure the order is the same as in your <assim.batch> block). If you're using this, make sure the targeted parameters are on the list. e.g.

    prior.list <- list(
      data.frame(distn = c("norm", "beta"), parama = c(4, 1), paramb = c(7, 2),
                 n = rep(NA, 2),
                 row.names = c("growth_resp_factor", "leaf_turnover_rate")),
      data.frame(distn = c("unif", "unif"), parama = c(10, 4), paramb = c(40, 27),
                 n = rep(NA, 2),
                 row.names = c("psnTOpt", "half_saturation_PAR"))
    )

external.knots

boolean, if TRUE pass either knots.list, or prior.list, ind.list and nknots so that the knots can be generated

knots.list

a list of dataframes (one per pft) where each row is a parameter vector, i.e. training points for the emulator. If not NULL these are used, otherwise knots will be generated using prior.list, ind.list and nknots.

ind.list

a named list of vectors (one per pft), where each vector indicates the indices of the parameters on the prior.list that are targeted in the PDA. e.g.

    ind.list <- list(temperate.deciduous = c(2), temperate.conifer = c(1, 2))
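Because ind.list refers to rows of prior.list by position, it can help to look up which parameter names the indices actually select before running the PDA. The sketch below does this with base R, reusing the example values from the prior.list and ind.list descriptions; the checks are an illustration added here, not part of the function.

```r
prior.list <- list(
  data.frame(distn = c("norm", "beta"), parama = c(4, 1), paramb = c(7, 2),
             n = rep(NA, 2),
             row.names = c("growth_resp_factor", "leaf_turnover_rate")),
  data.frame(distn = c("unif", "unif"), parama = c(10, 4), paramb = c(40, 27),
             n = rep(NA, 2),
             row.names = c("psnTOpt", "half_saturation_PAR"))
)
ind.list <- list(temperate.deciduous = c(2), temperate.conifer = c(1, 2))

# Every index must point at an existing row of the matching prior data frame.
in_range <- mapply(function(ind, prior) all(ind %in% seq_len(nrow(prior))),
                   ind.list, prior.list)
stopifnot(all(in_range))

# Translate indices into parameter names, one character vector per pft.
targeted <- Map(function(ind, prior) rownames(prior)[ind], ind.list, prior.list)
print(targeted)
```

Passing ind.list as the first argument to Map carries the pft names over to the result, so each set of targeted parameters stays labeled by pft.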

nknots

number of knots you want to train the emulator on

Examples

if (FALSE) { # \dontrun{
pda.externals <- pda.generate.externals(
  external.data = TRUE, obs = obs,
  varn = "NEE", varid = 297, n_eff = 106.9386,
  external.formats = TRUE, model_data_diag = TRUE,
  model.out = "/tmp/out/outdir",
  start_date = "2017-01-01", end_date = "2018-12-31"
)
} # }