Environmental Engineering Reference
uncorrelated variables, or “consider” the correlation when estimating model
parameters.
So where is the problem? Imagine an organism whose distribution is governed
entirely by its sensitivity to frost. When we combine our climate variables into one
or more principal components, model the species' distribution, and then predict to a
climate change scenario, the fact that both rainfall and mean summer temperature
are correlated with the number of frost days will dilute the impact of frost in the
model. The total effect of “frost” is distributed over all correlated variables. As a
consequence, any climate prediction will underestimate the effect of frost and hence
yield a “wrong” expected future distribution. If we don't know the true underlying
causal mechanism, statistics can help us very little, if at all. Any correct
ecological knowledge used in variable pre-selection, however, will lead to a smaller
bias in scenario projections!
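The dilution described above can be reproduced with a small simulation. The following is a minimal sketch under loud assumptions: the data are invented, suitability is taken to depend linearly on frost days only, rainfall is merely correlated with frost, and the scenario changes frost while leaving rainfall untouched. All names and numbers are hypothetical. With two standardized variables the first principal component has a closed form, so no PCA routine is needed here.

```python
import math
import random
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores plus the mean and standard deviation used."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values], m, s


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(42)  # fixed seed, illustration only
n_sites = 500

# Hypothetical data: frost days per year, and rainfall correlated with frost.
frost = [rng.gauss(60.0, 15.0) for _ in range(n_sites)]
rain = [900.0 - 4.0 * f + rng.gauss(0.0, 40.0) for f in frost]

# Assumed "true" mechanism: the suitability index depends on frost only.
EFFECT_PER_FROST_DAY = -0.01
suitability = [1.0 + EFFECT_PER_FROST_DAY * f for f in frost]

z_frost, _, sd_frost = standardize(frost)
z_rain, _, _ = standardize(rain)

# For two standardized, negatively correlated variables the first principal
# component is (z_frost - z_rain) / sqrt(2).
pc1 = [(a - b) / math.sqrt(2.0) for a, b in zip(z_frost, z_rain)]
intercept, slope = fit_line(pc1, suitability)

# Scenario (assumption): 20 fewer frost days everywhere, rainfall unchanged.
delta_frost = -20.0
true_change = EFFECT_PER_FROST_DAY * delta_frost
delta_pc1 = (delta_frost / sd_frost) / math.sqrt(2.0)
predicted_change = slope * delta_pc1

print(f"true change in suitability:        {true_change:.3f}")
print(f"predicted via principal component: {predicted_change:.3f}")
```

In this construction the component shares the frost effect equally between frost and rainfall, so the projected change is only half of the true one. The model fits the training data perfectly well; the bias only appears once the scenario breaks the correlation between the two variables.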
Dimensional Reduction
Often we may have dozens or even hundreds of potential explanatory variables (e.g.
from multispectral remote sensing or landscape metrics). We should try to reduce
this set to as few as possible for two reasons: (1) The more variables we have, the
more they will be correlated. (2) The more variables we have, the more likely one of
them will spuriously contribute to our model (type I error). For SDMs, Austin
(2002) and Guisan and Thuiller (2005) argue that we should choose “resource” over
“direct” and “direct” over “indirect” variables. For example, the abundance of prey
(hardly ever available) or nesting opportunities will be a resource variable when
analysing the distribution pattern of a bird of prey. Temperature or human
disturbance could be direct variables, impacting on the bird without moderation by other
variables. Indirect variables would be altitude or length of road in a grid cell, which
are substitutes, surrogates or proxies for other, more directly acting variables. These
indirect variables are often not immediately perceivable by the organism (such as
altitude by a plant or length of road verges by a rodent). So if we have two
(correlated) variables, we should discard the one “further away” from the species'
ecology.
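The rule of discarding the variable “further away” from the species' ecology can be written down as a small pruning routine. This is a sketch only: the classification of each variable as resource, direct or indirect has to come from ecological knowledge, and the correlation cut-off of 0.7 is an assumed value, not one given in the text. Variable names and data are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Lower rank = closer to the species' ecology (resource < direct < indirect).
PROXIMITY_RANK = {"resource": 0, "direct": 1, "indirect": 2}


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def prune_correlated(data, kinds, threshold=0.7):
    """Drop the more distal variable from each strongly correlated pair.

    data maps variable name -> values; kinds maps variable name -> class.
    threshold is an assumed cut-off for "strongly correlated".
    """
    kept = set(data)
    for a, b in combinations(sorted(data), 2):
        if a not in kept or b not in kept:
            continue  # one of the two has already been discarded
        if abs(pearson(data[a], data[b])) < threshold:
            continue  # not correlated strongly enough to matter
        rank_a, rank_b = PROXIMITY_RANK[kinds[a]], PROXIMITY_RANK[kinds[b]]
        if rank_a == rank_b:
            continue  # same class: left to ecological judgement
        kept.discard(a if rank_a > rank_b else b)
    return sorted(kept)


# Hypothetical grid-cell data for a bird of prey.
data = {
    "altitude": [100, 300, 500, 800, 1200, 1500],
    "temperature": [14.0, 13.0, 12.0, 10.0, 8.0, 6.0],
    "nest_sites": [3, 0, 4, 1, 5, 2],
}
kinds = {
    "altitude": "indirect",
    "temperature": "direct",
    "nest_sites": "resource",
}

selected = prune_correlated(data, kinds)
print(selected)
```

Here altitude and temperature are almost perfectly correlated, so altitude, the indirect variable, is dropped, while the weakly correlated nesting variable is kept. Pairs of the same class are deliberately left alone, because choosing between them is an ecological decision rather than a statistical one.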
If we are unable to reduce the data set sufficiently (i.e. to k ≪ N), we should use
dimensional reduction techniques, such as Principal Component Analysis 5 or its
more sophisticated variants that also allow categorical variables (nMDS 6). The
scores for the most important axes in this new parameter hyperspace can be used as
explanatory variables. Note that interpretation is often severely impaired by
automatic dimensional reductions. It is thus always advisable to use ecological
understanding rather than statistical functions at this step!
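To make concrete what “scores for the most important axes” are, here is a minimal sketch that extracts the first principal axis from three correlated variables and computes one score per site. It is written in plain Python for illustration and uses power iteration only to stay dependency-free; it is not meant to reproduce how the R functions named in the footnotes compute their results. The data and all names are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize_columns(rows):
    """Centre and scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def first_axis(z_rows, n_iter=200):
    """Leading eigenvector and eigenvalue of the correlation matrix."""
    n, k = len(z_rows), len(z_rows[0])
    corr = [[sum(r[i] * r[j] for r in z_rows) / n for j in range(k)]
            for i in range(k)]
    # Asymmetric start vector, to avoid starting orthogonal to the axis.
    vec = [1.0 / (i + 1) for i in range(k)]
    for _ in range(n_iter):
        new = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    # Rayleigh quotient: variance of the data along the axis found.
    eigenvalue = sum(vec[i] * corr[i][j] * vec[j]
                     for i in range(k) for j in range(k))
    return vec, eigenvalue


# Hypothetical sites: altitude (m), temperature (deg C), rainfall (mm).
sites = [
    [100, 14.0, 600],
    [300, 13.0, 650],
    [500, 12.0, 720],
    [800, 10.0, 800],
    [1200, 8.0, 950],
    [1500, 6.0, 1000],
]

z = standardize_columns(sites)
loadings, eigenvalue = first_axis(z)
explained = eigenvalue / len(loadings)  # share of total variance on axis 1

# One score per site: this column would replace the three original variables.
scores = [sum(l * v for l, v in zip(loadings, row)) for row in z]

print("loadings:", [round(l, 3) for l in loadings])
print("share of variance on first axis:", round(explained, 3))
print("site scores:", [round(s, 3) for s in scores])
```

The loadings show why interpretation suffers: the single score mixes altitude, temperature and rainfall, so a fitted effect of the score cannot be attributed to any one of them.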
5 prcomp
6 isoMDS in MASS or, more conveniently, metaMDS in vegan