This function is mainly used within pglmm but can also be used independently to
prepare a list of random effects, which users can then update for more complex models.
prep_dat_pglmm(
  formula,
  data,
  cov_ranef = NULL,
  repulsion = FALSE,
  prep.re.effects = TRUE,
  family = "gaussian",
  add.obs.re = TRUE,
  bayes = FALSE,
  bayes_nested_matrix_as_list = FALSE
)
A two-sided linear formula object describing the mixed effects of the model.
To specify that a random term should have a phylogenetic covariance matrix along
with a non-phylogenetic one, add __ (two underscores) at the end of the group variable;
e.g., + (1 | sp__) will construct two random terms, one with a phylogenetic covariance
matrix and another with a non-phylogenetic (identity) matrix.
In contrast, __ in the nested terms (below) will only create a phylogenetic covariance matrix.
Nested random terms have the general form (1|sp__@site__), which represents
phylogenetically related species nested within correlated sites.
This form can be used for bipartite questions. For example, species could be
phylogenetically related pollinators and sites could be phylogenetically related plants,
leading to the random effect (1|insects__@plants__). If more than one phylogeny is used,
remember to add all of them to the argument
cov_ranef = list(insects = insect_phylo, plants = plant_phylo).
Phylogenetic correlations can be dropped by removing the __ underscores. Thus, the form
(1|sp@site__) excludes the phylogenetic correlations among species, while the form
(1|sp__@site) excludes the correlations among sites.
Note that correlated random terms are not allowed. For example, (x|g) will be the same
as (0 + x|g) in the lme4::lmer syntax. However, (x1 + x2|g) won't work, so instead use
(x1|g) + (x2|g).
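The formula conventions described above can be written out in plain R. This is only a sketch of the syntax: the variable names (y, x, x1, x2, sp, site, insects, plants, g) and the helper named_identity are hypothetical, the identity matrices merely stand in for real phylogenies or covariance matrices, and nothing here calls pglmm.

```r
# Illustrative formulas only; all variable names are hypothetical.

# sp__ requests two random intercepts for species:
# one phylogenetic and one non-phylogenetic (identity).
f_species <- y ~ x + (1 | sp__)

# Nested term: phylogenetically related species within sites.
# site has no trailing "__", so sites are treated as uncorrelated.
f_nested <- y ~ x + (1 | sp__) + (1 | sp__@site)

# Bipartite term: both groups carry their own covariance matrix.
f_bipartite <- y ~ x + (1 | insects__@plants__)

# (x1 + x2 | g) is not supported, so each slope gets its own term.
f_slopes <- y ~ x1 + x2 + (x1 | g) + (x2 | g)

# Hypothetical helper: a labelled identity matrix used as a placeholder
# for a real covariance matrix.
named_identity <- function(labels) {
  m <- diag(length(labels))
  dimnames(m) <- list(labels, labels)
  m
}

# cov_ranef names match the group variables WITHOUT the two underscores.
cov_ranef <- list(
  insects = named_identity(c("i1", "i2", "i3")),
  plants  = named_identity(c("p1", "p2"))
)
```

Formulas are not evaluated when they are created, so the `@` in a nested term is stored as written and only interpreted later by the model-fitting code.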
A data.frame containing the variables named in formula.
A named list of covariance matrices of random terms. The names should be the
group variables that are used as random terms with specified covariance matrices
(without the two underscores, e.g. list(sp = tree1, site = tree2)). The actual object
can be either a phylogeny with class "phylo" or a prepared covariance matrix. If it is a
phylogeny, pglmm will prune it and then convert it to a covariance matrix assuming
Brownian motion evolution. pglmm will also standardize all covariance matrices to have a
determinant of one. Group variables will be converted to factors and all covariance
matrices will be rearranged so that rows and columns are in the same order as the levels
of their corresponding group variables.
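The reordering and determinant scaling described above can be sketched in base R. The helper standardize_cov and the toy matrix are hypothetical and are not taken from the package's internal code, which may differ in detail (for example, in how it prunes unmatched levels).

```r
# Hypothetical helper, NOT the package's internal implementation.
standardize_cov <- function(V, group) {
  group <- as.factor(group)
  lv <- levels(group)
  # Reorder rows and columns to follow the factor levels.
  V <- V[lv, lv, drop = FALSE]
  # Dividing an n x n matrix by a constant c divides its determinant by c^n,
  # so c = det(V)^(1/n) rescales V to determinant one.
  n <- nrow(V)
  log_det <- as.numeric(determinant(V, logarithm = TRUE)$modulus)
  V / exp(log_det / n)
}

# Toy covariance matrix whose rows are NOT in factor-level order.
V <- matrix(c(2, 0.5, 0.5, 1), nrow = 2,
            dimnames = list(c("sp_b", "sp_a"), c("sp_b", "sp_a")))
sp <- c("sp_a", "sp_b", "sp_a")

V_std <- standardize_cov(V, sp)
rownames(V_std)  # "sp_a" "sp_b", matching levels(factor(sp))
det(V_std)       # approximately 1
```

Working with the log-determinant avoids overflow or underflow when the matrix is large, which is why the sketch does not call det() directly for the scaling step.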
When there are nested random terms specified, repulsion = FALSE tests
for phylogenetic underdispersion while repulsion = TRUE tests for overdispersion.
This argument is a logical vector of length either 1 or >1.
If its length is 1, then all covariance matrices in nested terms will be either
inverted (overdispersion) or not. If its length is >1, then you can select
which covariance matrices in the nested terms are inverted. Make sure to get
the length right: for all the terms with @, count the number of "__"
to determine the length of repulsion. For example, sp__@site and sp@site__
will each require one element of repulsion, while sp__@site__ will take two
elements (repulsion for sp and repulsion for site). Therefore, if your nested terms are
(1|sp__@site) + (1|sp@site__) + (1|sp__@site__), then you should set
repulsion to be something like c(TRUE, FALSE, TRUE, TRUE) (length of 4).
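The counting rule above can be checked with a few lines of base R. The helper repulsion_length is hypothetical (it is not provided by the package) and assumes the formula is supplied as a character string with each random term in its own pair of parentheses.

```r
# Hypothetical helper: count the "__" markers in nested (@) terms.
repulsion_length <- function(formula_text) {
  # Extract every parenthesised random term, e.g. "(1|sp__@site)".
  terms <- regmatches(formula_text,
                      gregexpr("\\([^()]*\\)", formula_text))[[1]]
  # Keep only nested terms, i.e. those containing "@".
  nested <- terms[grepl("@", terms, fixed = TRUE)]
  if (length(nested) == 0) return(0L)
  # Count the occurrences of "__" in each nested term and add them up.
  counts <- lengths(regmatches(nested, gregexpr("__", nested, fixed = TRUE)))
  sum(counts)
}

# The non-nested term (1|sp__) is ignored; the three nested terms
# contribute 1 + 1 + 2 markers.
n_repulsion <- repulsion_length(
  "y ~ x + (1|sp__) + (1|sp__@site) + (1|sp@site__) + (1|sp__@site__)"
)
n_repulsion  # 4
```

A repulsion vector of that length, such as c(TRUE, FALSE, TRUE, TRUE), then supplies one element per counted "__" marker.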
Whether to prepare random effects for users.
Either "gaussian" for a Linear Mixed Model, or
"binomial" or "poisson" for Generalized Linear Mixed Models.
"family" should be specified as a character string (i.e., quoted). For binomial and
Poisson data, we use the canonical logit and log link functions, respectively.
Binomial data can be either presence/absence, or a two-column array of 'successes' and 'failures'.
For both binomial and Poisson data, we add an observation-level
random term by default via add.obs.re = TRUE. If bayes = TRUE, there are
two additional families available: "zeroinflated.binomial" and "zeroinflated.poisson",
which add a zero-inflation parameter; this parameter gives the probability that the response is
a zero. The rest of the parameters of the model then reflect the "non-zero" part
of the model. Note that "zeroinflated.binomial" only makes sense for success/failure
response data.
Whether to add an observation-level random term for binomial or Poisson
distributions. Normally it would be a good idea to add this to account for overdispersion,
so add.obs.re = TRUE by default.
Whether to fit a Bayesian version of the PGLMM using r-inla.
For bayes = TRUE, whether to prepare the nested terms as a list of length 4
(the old way).
A list with updated formula, random.effects, and updated cov_ranef.