Modelling Species’ Distributions - Modelling Complex Ecological Dynamics


plotted to gauge its shape. For interactions we need to plot each variable at each

level of the other variable, thus visualizing synergistic or compensatory effects of

the two variables.

The key idea behind SDM, i.e. the environmental niche of a species, implies a
hump-shaped relationship between any environmental predictor and a species'
occurrence: there are lower and upper limits. Hence, we must allow the model to
be non-linear. If we happen to sample only part of the entire gradient, we also
need to consider saturation curves, which are again non-linear. The simplest, and
generally sufficient, way to include non-linearity is to generate a new, squared
variable for each continuous predictor.⁸ This represents the third term of a
Taylor series (which, with enough terms, can approximate any sufficiently smooth
function).
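To make the squared-term approach concrete, here is a minimal sketch in R. The data are simulated, and the gradient X1, its range and the "true" optimum at 15 are assumptions made purely for illustration. The sketch fits a logistic regression with a linear and a squared term and reads the estimated optimum off the coefficients.

```r
# Simulated presence/absence data along one hypothetical gradient X1.
set.seed(42)
n  <- 500
X1 <- runif(n, min = 0, max = 30)

# Assumed "true" niche: occurrence probability peaks at X1 = 15.
p_true <- plogis(2 - 0.05 * (X1 - 15)^2)
y      <- rbinom(n, size = 1, prob = p_true)

# Logistic regression with a linear and a squared term.
fit <- glm(y ~ X1 + I(X1^2), family = binomial)
b   <- coef(fit)

# A negative quadratic coefficient gives a hump on the logit scale;
# the vertex of that parabola (the estimated optimum) is -b1 / (2 * b2).
optimum <- unname(-b["X1"] / (2 * b["I(X1^2)"]))
print(round(b, 3))
print(optimum)
```

If the sampled gradient covered only one flank of the hump, the same formula would still fit, but the estimated optimum could fall outside the observed range and should then not be interpreted.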

When using GAM or other spline-based approaches, non-linearity is governed by
the smoothing function used. Here the issue is not so much how to model
non-linearity, but rather how much non-linearity we allow for. Reducing the
“wiggliness” of splines (either by stepwise model selection for the number of
knots in each predictor⁹ or by shrinkage of spline fits¹⁰) prevents over-fitting
and should be the standard approach.
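The shrinkage route can be sketched as follows, assuming the package mgcv is installed. The data are simulated: X1 is given a real hump-shaped effect and X2 is pure noise, both invented for illustration. The basis bs = "ts" is one of mgcv's shrinkage smoothers; whether it is the best choice for a given data set is a separate question.

```r
library(mgcv)  # assumed to be available

set.seed(1)
n  <- 400
X1 <- runif(n, 0, 30)   # hypothetical predictor with a hump-shaped effect
X2 <- runif(n, 0, 30)   # hypothetical predictor without any effect
y  <- rbinom(n, 1, plogis(2 - 0.05 * (X1 - 15)^2))

# bs = "ts": thin-plate regression spline with shrinkage, so that a smooth
# term can be penalised all the way towards zero degrees of freedom.
fit <- gam(y ~ s(X1, bs = "ts") + s(X2, bs = "ts"),
           family = binomial, method = "REML")

# Effective degrees of freedom per smooth term: a value near zero means
# the term has effectively been shrunk out of the model.
edf <- summary(fit)$edf
names(edf) <- c("s(X1)", "s(X2)")
print(round(edf, 2))
```

The effective degrees of freedom thus play the role that the number of knots plays in stepwise selection: they report how much wiggliness each predictor was finally allowed.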

Interactions are similarly relevant. Statistically, an interaction is modelled as
the product of the participating predictors. Ecologically, it means that the
effect of one variable depends on the value of the other, so we need to know the
values of all variables included in the interaction, not only the main effects.
Because this is highly relevant and often difficult for the beginner, let me
briefly give an example. Assume that global patterns of plant diversity are well
predicted by the predictors “annual precipitation” and “mean annual temperature”,
together with their interaction. For the main effects, wet or hot means more
species, but not necessarily. When a site is hot, it needs to also be wet to have
high species richness; otherwise it may well be a barren desert. But when cold, a
site will never support many plant species, independent of precipitation. In this
example, neither temperature nor rainfall alone is sufficient to predict species
richness at any site; we need to interpret them in concert.
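The temperature and precipitation example can be sketched in R with simulated data. All numbers below, including the coefficient 0.004 and the noise level, are assumptions chosen only so that richness is high where a site is both hot and wet. The sketch compares an additive model with one that includes the interaction, then predicts richness at the four corners of the climate space.

```r
set.seed(7)
n      <- 300
temp   <- runif(n, 0, 30)     # hypothetical mean annual temperature (deg C)
precip <- runif(n, 0, 3000)   # hypothetical annual precipitation (mm)

# Assumed truth: richness is high only where a site is both hot and wet.
richness <- 5 + 0.004 * temp * precip + rnorm(n, sd = 10)

fit_add <- lm(richness ~ temp + precip)   # main effects only
fit_int <- lm(richness ~ temp * precip)   # main effects plus temp:precip

# Predictions for cold/dry, hot/dry, cold/wet and hot/wet sites (in that order).
corners <- expand.grid(temp = c(2, 28), precip = c(200, 2800))
corners$pred <- predict(fit_int, newdata = corners)
print(corners)
print(AIC(fit_add, fit_int))
```

Reading the predictions row by row is the tabular equivalent of plotting each variable at each level of the other: the effect of precipitation is small at low temperature and large at high temperature.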

Classification and regression trees (CARTs) embrace non-linearity and
interactions in an elegant and natural way. Their boosted (BRT) or bagged
(randomForest) extensions hence do not require specification of non-linearity
and interactions.
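As a minimal sketch of this property, the following fits a single regression tree with the package rpart (assumed to be installed) to simulated data in which richness depends on the product of two hypothetical predictors. Note that the model formula contains main effects only; no interaction or squared term is specified.

```r
library(rpart)  # assumed to be available

set.seed(7)
n <- 300
d <- data.frame(temp   = runif(n, 0, 30),     # hypothetical predictor
                precip = runif(n, 0, 3000))   # hypothetical predictor
# Assumed truth: richness is high only where a site is both hot and wet.
d$richness <- 5 + 0.004 * d$temp * d$precip + rnorm(n, sd = 10)

# Regression tree; the formula lists the two predictors additively.
tree <- rpart(richness ~ temp + precip, data = d, method = "anova")

# Predictions for cold/dry, hot/dry, cold/wet and hot/wet sites (in that order).
corners <- expand.grid(temp = c(2, 28), precip = c(200, 2800))
corners$pred <- predict(tree, newdata = corners)
print(tree)
print(corners)
```

Because each split is made within the subset defined by the splits above it, the tree can represent the interaction without our specifying it. The price is a step-like rather than smooth response surface.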

Model Simplification

One of the fundamental problems in building statistical models is the trade-off

between the variance explained by the model, and the bias it produces when

8 This can be done either manually (X1.2 <- X1^2) or as part of the model formula
(y ~ X1 + I(X1^2)); higher-order polynomials should be specified using poly
(y ~ poly(X1, degree = 3)), which calculates orthogonal polynomials.

9 As proposed for the function gam in package gam: see ?gam::step.gam.

10 As proposed for the function gam in package mgcv: see ?mgcv::step.gam.
