model residuals often show complex structures
that can involve correlation in time and/or
space; heteroscedasticity (changing variance with
magnitude of prediction); non-Gaussian distribu-
tions; and non-stationary bias. Some of these
characteristics can be allowed for by modelling or
transforming the residuals to allow a simple like-
lihood function to be used. There is a danger in
doing so, however. If the model of the residuals is
too simple, then it will result in overconditioning
of the parameter distribution (see, for example,
Beven, 2006a; Beven et al., 2008).
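As an illustration of what modelling the residuals can mean in practice, the sketch below compares a simple independent Gaussian log-likelihood with one that allows for lag-1 autocorrelation. It is a minimal example under stated assumptions, not the formulation of the cited authors: the function names, the AR(1) error model, and the parameters `rho` and `sigma` are all assumptions introduced here.

```python
import numpy as np


def iid_gaussian_loglik(residuals, sigma):
    """Log-likelihood assuming independent N(0, sigma^2) residuals."""
    e = np.asarray(residuals, dtype=float)
    n = e.size
    return -0.5 * n * np.log(2.0 * np.pi * sigma**2) - 0.5 * np.sum(e**2) / sigma**2


def ar1_gaussian_loglik(residuals, rho, sigma):
    """Exact log-likelihood for a stationary AR(1) residual model (assumed here).

    e[t] = rho * e[t-1] + innovation, innovation ~ N(0, sigma^2), |rho| < 1.
    """
    e = np.asarray(residuals, dtype=float)
    # The first residual is drawn from the stationary distribution.
    var0 = sigma**2 / (1.0 - rho**2)
    ll = -0.5 * np.log(2.0 * np.pi * var0) - 0.5 * e[0] ** 2 / var0
    # Remaining residuals are scored through their innovations.
    innov = e[1:] - rho * e[:-1]
    ll += -0.5 * innov.size * np.log(2.0 * np.pi * sigma**2)
    ll += -0.5 * np.sum(innov**2) / sigma**2
    return ll
```

With `rho = 0` the two functions agree, so the AR(1) form can be read as a one-parameter extension of the simple likelihood. Whether such an extension is adequate for a given residual series is exactly the judgement the text warns about.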
The problem of integrating the likelihood
function over the parameter space is directly
related to the choice of likelihood function. This
is because the shape of the response surface in the
parameter space will reflect that choice: in par-
ticular the 'peakiness' of the surface. When the
posterior parameter distribution is strongly con-
ditioned by the observations and the assumed
likelihood function, then the region of high like-
lihood might be very local in the parameter space.
If only a small number of parameters is being con-
sidered then this is not a problem, but the higher
the dimensions of the parameter space, the more
difficult it becomes to find a local high likelihood
region, especially if non-informative prior distri-
butions, or incorrect prior distributions, have
been assumed.
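The effect of dimension can be made concrete with a deliberately crude calculation. The sketch assumes a uniform prior on a unit hypercube and a high likelihood region that is a box occupying the same fraction `width` of each parameter range; both assumptions, and the function names, are hypothetical simplifications for illustration only.

```python
def high_likelihood_fraction(width, n_params):
    """Fraction of a unit-hypercube prior covered by a box of side `width`."""
    if not 0.0 < width <= 1.0:
        raise ValueError("width must lie in (0, 1]")
    return width**n_params


def expected_samples_to_hit(width, n_params):
    """Expected number of uniform random samples before one lands in the box."""
    return 1.0 / high_likelihood_fraction(width, n_params)
```

For example, a region spanning one tenth of each parameter range covers 1% of a two-parameter space, but finding it by uniform sampling in ten dimensions would take on the order of 10^10 model runs.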
The most common approach to integrating
the likelihood function is the family of Markov
chain Monte Carlo (MCMC) methods (e.g.
Gamerman, 1997; Beven, 2009). In MCMC meth-
ods, an initial sample is used to guide the next set
of samples in a way that should lead to samples
with a density in the parameter space propor-
tional to the likelihood. The greater the number
of samples used, the better the integration of the
likelihood function. Tests for convergence of the
posterior distribution can be used as more sam-
ples are added. Sampling (and consequently rates
of convergence) can be controlled by parameters
of the MCMC algorithm. For complex cases with
high parameter dimensions, convergence may
take a very large number of samples with sudden
jumps from one part of the parameter space to
another. This indicates that, for these complex
cases, the likelihood surface may not be simple
but may contain several distinct peaks. Where the
correct form of the likelihood function is in doubt,
there is also no guarantee that the optimum for one
likelihood function will lie in the same region as
the optimum for an alternative likelihood function.
This is significant because the validity of a
likelihood function
is generally checked with respect to the residuals
produced by the model with parameter values at
the peak of a likelihood function. It is therefore
possible that the structure of the residuals might
support the choice of likelihood function, or not.
If not, then the series of residuals might suggest
an alternative form of likelihood function, but
since the maximum likelihood model is then
likely to have quite different parameters, it might
also have different residual characteristics again.
It is therefore easier to reject a likelihood func-
tion than to find the 'best' likelihood function.
There are examples in the hydrological literature
where the residuals clearly do not support the
chosen likelihood function (but see Engeland
et al., 2005, for an example of good practice in
checking residual structures).
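The sampling idea described above can be sketched with a random-walk Metropolis sampler, one of the simplest members of the MCMC family. This is a minimal illustration, not the algorithm used in the cited studies: the function names, the Gaussian proposal, the `step` tuning parameter, and the toy standard normal target are all assumptions made here.

```python
import numpy as np


def metropolis(log_post, theta0, n_samples, step, seed=0):
    """Random-walk Metropolis sampler.

    log_post : function returning the log of the (unnormalised) posterior
    theta0   : starting parameter vector
    step     : standard deviation of the Gaussian proposal (controls sampling)
    Returns the chain of samples and the acceptance rate.
    """
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    lp = log_post(theta)
    chain = np.empty((n_samples, theta.size))
    accepted = 0
    for i in range(n_samples):
        # Propose a move guided by the current sample.
        proposal = theta + step * rng.standard_normal(theta.size)
        lp_prop = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
            accepted += 1
        chain[i] = theta
    return chain, accepted / n_samples


def log_post_demo(theta):
    """Toy target for illustration: a standard normal in each parameter."""
    return float(-0.5 * np.sum(np.asarray(theta) ** 2))
```

A call such as `chain, rate = metropolis(log_post_demo, [3.0], 20000, 1.0)` gives samples whose density approximates the target once the early part of the chain is discarded. On a multi-peaked surface a sampler this simple can stay near one peak for long periods, which is one reason convergence checks matter.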
This issue of an appropriate likelihood func-
tion is important because of the way in which the
assumption of randomness of residuals in statis-
tical inference implies that every residual is
informative. The result is very strong
conditioning of the likelihood (a very peaked
likelihood surface) as the number of residuals
included in the evaluation increases
(see Mantovan & Todini, 2006; Beven et al., 2008).
When time series of data are available (as is often
the case when models are evaluated with dis-
charge data and continuous turbidity measure-
ments, for example) then the conditioning can be
very strong indeed and the resulting marginal
posterior parameter distributions are well defined
with small variance.
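The strength of this conditioning can be seen in a small calculation. The sketch assumes independent Gaussian residuals with a fixed, known standard deviation `sigma` and compares two parameter sets through their mean squared residuals; the function name and the numbers in the comments are illustrative assumptions, not values from the text.

```python
import math


def log_likelihood_ratio(mse_a, mse_b, sigma, n):
    """log[L(b) / L(a)] for n independent N(0, sigma^2) residuals.

    mse_a, mse_b : mean squared residual of parameter sets a and b
    The normalising terms cancel, leaving -n * (mse_b - mse_a) / (2 sigma^2).
    """
    return -0.5 * n * (mse_b - mse_a) / sigma**2


# Parameter set b is only 5% worse than a in mean squared residual.
ratio_short = math.exp(log_likelihood_ratio(1.00, 1.05, 1.0, 10))    # about 0.78
ratio_long = math.exp(log_likelihood_ratio(1.00, 1.05, 1.0, 1000))   # about 1.4e-11
```

Under these assumptions, a parameter set that is nearly indistinguishable from the best one with 10 residuals is effectively excluded with 1000, which is why long time series produce such sharply defined posterior distributions.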
Now this could be interpreted as indicating
that the model is very well defined (even if the
residual variance might still be quite high), but
this is dangerous because in most environmental
modelling it is not true that every residual can be
considered informative because of the complex