
Build a trait-environment GLMM formula safely and flexibly
Source: R/build_glmm_formula.R
build_glmm_formula.Rd
Constructs a model formula for trait-environment analyses in a single step.
The function (i) auto-detects trait and environment columns from a long-format
table, (ii) assembles fixed effects for all traits and all environment variables,
(iii) optionally includes all pairwise \(trait \times environment\) interactions,
and (iv) appends user-specified random-effects terms. The returned object is a
standard formula suitable for glmmTMB, lme4, etc.
Arguments
- data
data.frame. Long-format observations (e.g., species-by-site), including the response, species ID, site ID, trait columns, and environment columns.
- response
character (default "count"). Name of the response variable (e.g., count/abundance).
- species_col
character (default "species"). Column name identifying species.
- site_col
character (default "site_id"). Column name identifying sites.
- trait_cols
NULL or character vector. If NULL (default), traits are auto-detected using the name prefixes ^trait_, ^t_, or ^trt_. If not found, falls back to “everything not excluded” (see env_exclude). Pass explicit names for full control.
- env_cols
NULL or character vector. If NULL (default), environment variables are auto-detected using the name prefixes ^env_, ^e_, ^clim_, or ^soil_. If not found, falls back to “everything not in traits and not excluded”.
- env_exclude
character vector. Columns to exclude from environment auto-detection. Defaults to c("site_id","x","y","count","species"). Adjust to your schema.
- include_interactions
logical (default TRUE). If TRUE, adds a single block term (traits):(envs), which expands to all pairwise \(trait \times environment\) interactions.
- random_effects
character vector. Random-effect terms to append to the RHS (e.g., "(1 | species)"). Use character(0) to omit random effects. The default adds random intercepts for species and site: c("(1 | species)", "(1 | site_id)").
Value
A formula with fixed effects (traits + envs + interactions) and any requested random effects, e.g.:
count ~ trait_cont1 + ... + trait_cat + env1 + ... + envK +
(trait_cont1 + ... + trait_cat):(env1 + ... + envK) +
(1 | species) + (1 | site_id)
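Because the return value is an ordinary formula, it can be inspected with base R before any model is fitted. The sketch below uses a hand-written formula of the shape shown above (it does not call build_glmm_formula()), and the `available` vector is a hypothetical stand-in for `names(data)`.

```r
# Hand-written formula with the same structure as the documented return value.
# It is an illustration only and is not produced by build_glmm_formula() here.
fml <- count ~ trait_cont1 + trait_cat + env1 +
  (trait_cont1 + trait_cat):(env1) +
  (1 | species) + (1 | site_id)

class(fml)      # an object of class "formula"
fml[[2]]        # left-hand side: the response symbol
all.vars(fml)   # every variable name the formula refers to

# Hypothetical column names standing in for names(data)
available <- c("count", "trait_cont1", "trait_cat", "env1",
               "species", "site_id", "x", "y")

# Variables used by the formula but absent from the data
missing_vars <- setdiff(all.vars(fml), available)
missing_vars
```

A check like this is one way to catch a misspelled column before passing the formula to a fitting function.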
Details
Auto-detection:
Traits: first tries prefixes ^trait_, ^t_, ^trt_. If none match, uses all columns not in env_exclude, not response, not species_col, and not site_col.
Environment: first tries prefixes ^env_, ^e_, ^clim_, ^soil_. If none match, uses remaining non-excluded columns not already assigned as traits.
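The detection rules above can be sketched in base R as follows. This is a minimal illustration of prefix matching with a fallback, not the package's actual implementation; `detect_cols()` is a hypothetical helper, and the column names are those of the toy data in the Examples section.

```r
# detect_cols() is a hypothetical helper written for this sketch only.
# It returns the names matching any prefix, or the fallback if none match.
detect_cols <- function(nms, prefixes, fallback) {
  pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")")
  hits <- grep(pattern, nms, value = TRUE)
  if (length(hits) > 0) hits else fallback
}

nms <- c("site_id", "species", "x", "y", "count",
         "trait_cont1", "trait_cont2", "trait_cat", "env1", "env2")
env_exclude <- c("site_id", "x", "y", "count", "species")

# Columns left over once excluded columns, response, and IDs are removed
candidates <- setdiff(nms, c(env_exclude, "count", "species", "site_id"))

traits <- detect_cols(nms, c("trait_", "t_", "trt_"), fallback = candidates)
envs <- detect_cols(nms, c("env_", "e_", "clim_", "soil_"),
                    fallback = setdiff(candidates, traits))

traits
envs
```

Note that `env1` and `env2` do not match `^env_` because there is no underscore after `env`, so in this sketch they are picked up by the fallback rule rather than by a prefix. Passing `env_cols` explicitly avoids relying on the fallback.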
Interactions:
When include_interactions = TRUE, a single block term (t1 + t2 + ...):(e1 + e2 + ...) is inserted; model-fitting packages will expand it to all pairwise interactions. Disable with FALSE if the design is too large or you prefer targeted interactions.
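To see how the block term expands, and to gauge how many interaction terms a design will contain, the term labels can be listed with base R's `terms()`. The variable names below are placeholders and do not need to exist in any data set.

```r
# Block interaction term with 3 placeholder traits and 2 placeholder environments
f <- y ~ (t1 + t2 + t3):(e1 + e2)

# terms() expands the block into individual trait:environment terms
labels <- attr(terms(f), "term.labels")
labels

# Number of pairwise terms: number of traits times number of environments
length(labels)
```

The count of terms grows as the product of the number of traits and the number of environment variables, and a factor trait can contribute more than one coefficient per term, so the fitted design may be larger than the term count suggests.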
Random effects:
Supplied verbatim (e.g., random intercepts/slopes). For example, c("(1 | species)", "(1 | site_id)") or c("(1 + key_trait | species)").
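One plausible way to append verbatim random-effect strings to a fixed-effects right-hand side is to paste everything together and convert the result with `as.formula()`. The sketch below shows that approach under the assumption that terms are joined with `+`; it is not taken from the package source, and the term names are illustrative.

```r
# Illustrative fixed-effect terms and random-effect strings (hypothetical names)
fixed <- c("trait_cont1", "env1", "(trait_cont1):(env1)")
random_effects <- c("(1 + trait_cont1 | species)", "(1 | site_id)")

# Join all terms with "+" and convert the string to a formula
rhs <- paste(c(fixed, random_effects), collapse = " + ")
fml <- as.formula(paste("count ~", rhs))
fml

# With character(0), only the fixed effects remain
rhs_fixed <- paste(c(fixed, character(0)), collapse = " + ")
fml_fixed <- as.formula(paste("count ~", rhs_fixed))
fml_fixed
```

Because the strings are inserted as written, a random slope such as `(1 + trait_cont1 | species)` must refer to a column that exists in the data.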
Examples
# Minimal reproducible toy example -----------------------------------------
set.seed(1)
n <- 100
longDF <- data.frame(
  site_id = factor(sample(paste0("s", 1:10), n, TRUE)),
  species = factor(sample(paste0("sp", 1:15), n, TRUE)),
  x = runif(n), y = runif(n),
  count = rpois(n, lambda = 3),
  # traits
  trait_cont1 = rnorm(n),
  trait_cont2 = rnorm(n),
  trait_cat = factor(sample(letters[1:3], n, TRUE)),
  # environments
  env1 = scale(rnorm(n))[, 1],
  env2 = scale(runif(n))[, 1]
)
# Build a full formula with all trait × environment interactions and default REs
fml <- build_glmm_formula(longDF)
fml
#> count ~ trait_cont1 + trait_cont2 + trait_cat + env1 + env2 +
#> (trait_cont1 + trait_cont2 + trait_cat):(env1 + env2) + (1 |
#> species) + (1 | site_id)
#> <environment: 0x563c7bec6e88>
# Example fit (uncomment if glmmTMB is available)
# mod <- glmmTMB::glmmTMB(fml, data = longDF, family = glmmTMB::tweedie(link = "log"))
# summary(mod)
# Targeted columns & no interactions
fml2 <- build_glmm_formula(
  data = longDF,
  trait_cols = c("trait_cont1", "trait_cont2", "trait_cat"),
  env_cols = c("env1", "env2"),
  include_interactions = FALSE,
  random_effects = character(0)
)
fml2
#> count ~ trait_cont1 + trait_cont2 + trait_cat + env1 + env2
#> <environment: 0x563c7bfc2728>