If the input data and model structure are known to be correct and random residuals are created by construction, then an arbitrary choice of informal likelihood will not reproduce the results of a formal, and in such hypothetical cases objective, statistical analysis (Mantovan & Todini, 2006; Stedinger et al., 2008). The suggestion then follows that if informal likelihood measures in GLUE do not work for such hypothetical examples, they should not be expected to work in real applications either (see also the series of discussions in Beven, 2006b; Todini & Mantovan, 2007; Hall et al., 2007; Montanari, 2007; Andréassian et al., 2007).
A response to these arguments is provided by Beven (2006a), who discusses the differences between ideal and non-ideal (real application) cases, and by Beven et al. (2008), who show that even mild departures from the correct assumptions of a hypothetical case can lead to bias in parameter estimates. The case against the use of informal likelihoods is not so clear-cut when there is a danger of overconditioning the parameters if the strong assumptions of a formal likelihood are not met. In real applications, therefore, where input errors and model structural errors might be important, we need to be more circumspect about the objectivity of formal statistical inference. The GLUE methodology then has some features that can be advantageous (and, as noted above, it can incorporate formal methods as special cases when it is felt that the strong assumptions can be justified).
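As context for these features, a minimal sketch of the basic GLUE procedure with an informal likelihood may help. Everything here is illustrative rather than taken from the text: the uniform Monte Carlo sampling, the Nash-Sutcliffe efficiency as the informal likelihood measure, and the 0.5 behavioural threshold are common choices in the GLUE literature, and the function names are invented for the example.

import numpy as np

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; values <= 0 mean the
    model does no better than predicting the mean of the observations."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def glue_sample(model, obs, param_bounds, n_samples=10_000, threshold=0.5, seed=None):
    """Monte Carlo GLUE: sample parameter sets uniformly over param_bounds
    (array of shape (n_params, 2)), retain those whose informal likelihood
    exceeds the behavioural threshold, and rescale the retained likelihoods
    to sum to one so they can be used as prediction weights."""
    rng = np.random.default_rng(seed)
    lower, upper = param_bounds[:, 0], param_bounds[:, 1]
    behavioural, likelihoods = [], []
    for _ in range(n_samples):
        theta = rng.uniform(lower, upper)          # one candidate parameter set
        score = nash_sutcliffe(obs, model(theta))
        if score > threshold:                      # behavioural / non-behavioural split
            behavioural.append(theta)
            likelihoods.append(score)
    weights = np.asarray(likelihoods)
    return np.asarray(behavioural), weights / weights.sum()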
One such feature is the implicit handling of residual errors, in all their complexity. When a particular realization of a model is compared with observations, the residual errors are known exactly. In formal statistical methods, a model of those errors is then proposed and its parameters identified; this error model is then assumed to hold when the model is used for prediction. If, as in most applications of GLUE, the treatment of the residual error is left implicit, a similar assumption holds: that the characteristics of the errors for a model belonging to the behavioural set will remain similar (in all their complexity) in prediction. If a behavioural model overpredicts in evaluation under certain conditions, then we expect it to overpredict under other similar conditions; likewise, we expect it to underpredict where it has underpredicted in the past. Thus, in weighting the predictions of that model, there is also effectively a weighting of the residual errors implied by that model. If a model is a good representation of the system response, then we might expect some model realizations to underpredict and others to overpredict in different parts of the evaluation period in some consistent way. It is then likely that in prediction the set of models will bracket the observations (as demonstrated for hypothetical examples by Beven et al., 2008; Smith et al., 2008).
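The bracketing behaviour described above can be made concrete: given the behavioural models' predictions and their likelihood weights, weighted quantiles yield prediction bounds whose coverage of the observations can be checked. This is a hedged sketch only; the 5-95% quantile choice and the function names are assumptions, not taken from the source.

import numpy as np

def weighted_quantile(values, weights, q):
    """Quantile of `values` under the normalized likelihood `weights`."""
    order = np.argsort(values)
    cdf = np.cumsum(weights[order])
    return np.interp(q, cdf, values[order])

def prediction_bounds(sim_ensemble, weights, q_lower=0.05, q_upper=0.95):
    """Likelihood-weighted prediction bounds at each time step.
    sim_ensemble has shape (n_behavioural_models, n_timesteps); each row is
    one behavioural model's prediction, weighted by its GLUE likelihood."""
    lower = np.array([weighted_quantile(col, weights, q_lower) for col in sim_ensemble.T])
    upper = np.array([weighted_quantile(col, weights, q_upper) for col in sim_ensemble.T])
    return lower, upper

# Bracketing check: the fraction of observations falling inside the bounds.
# lower, upper = prediction_bounds(sim_ensemble, weights)
# coverage = np.mean((obs >= lower) & (obs <= upper))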
The more interesting case occurs when the model cannot bracket the observations in calibration (and we therefore should not expect it to do so in evaluation or prediction). This suggests that there is some lack of knowledge about the data or processes that is leading to (possibly non-stationary) bias in the model predictions (see Section 4.4 below). There might be many reasons for this, most importantly model structure error or input error, but experience suggests it is a generic problem in the application of environmental models. These are just the conditions under which an evaluation of residuals might reveal difficulties in formulating a simple statistical error model in the formal Bayes approach. This led Beven (2006a) to suggest a different way of approaching model evaluation: defining limits of acceptability prior to making any model runs. The limits of acceptability might be based purely on the user requirements (how good do the predictions need to be?), or on a consideration of the errors in both the input data and the observations with which the model is being compared. The idea is then that models are treated as members of the behavioural set if their predictions lie within the limits of acceptability, and are considered non-behavioural if they lie outside them. Behavioural models can be given a likelihood based on their performance within the limits of acceptability. There is only limited experience with this approach (although it was already being used in, for example, Freer et al., 2004; Page et al., 2007; Iorgulescu et al., 2005, 2007), but it seems
that it can be rather difficult to find models that satisfy the limits of acceptability for all of the observations.
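A minimal sketch of this limits-of-acceptability evaluation follows, under the assumption that the limits strictly bracket each observation (lower < obs < upper). The triangular weighting within the limits is one choice used in applications of the approach, not something this passage prescribes, and the function name is invented for the example.

import numpy as np

def limits_of_acceptability_score(obs, sim, lower, upper):
    """Evaluate one model run against limits of acceptability defined
    before any model runs are made. Returns None for a non-behavioural
    model (any prediction outside its limits), or a likelihood in (0, 1]
    built from a triangular score that equals 1 when the prediction hits
    the observation and falls linearly to 0 at either limit."""
    if np.any((sim < lower) | (sim > upper)):
        return None                                # outside the limits: rejected
    score = np.where(
        sim <= obs,
        (sim - lower) / (obs - lower),             # prediction below the observation
        (upper - sim) / (upper - obs),             # prediction above the observation
    )
    return float(score.mean())                     # average score over all observations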