There are a variety of statistical methods for identifying model order, some of which are mentioned later. In general, however, they exploit order identification statistics, such as the correlation-based statistics popularized by Box and Jenkins (1970), the well known Akaike Information Criterion (AIC: Akaike, 1974), and the more heuristic YIC statistic (Young et al., 1996), which provides an alternative to the AIC in the case of transfer functions (where the AIC tends to identify over-parameterized models: see the discussion in Chapter 2).
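As a minimal illustration of AIC-based order identification (the YIC requires parameter covariance information and is not shown), the following sketch fits autoregressive models of increasing order by least squares to hypothetical simulated data and selects the order that minimizes the AIC; all data and helper names are assumptions for this example, not part of the original analysis.

```python
import numpy as np

# Simulated AR(2) series standing in for real time series data (hypothetical).
rng = np.random.default_rng(0)
N = 500
y = np.zeros(N)
for t in range(2, N):
    y[t] = 0.7 * y[t - 1] - 0.2 * y[t - 2] + 0.1 * rng.normal()

def aic_for_order(y, p):
    """Least squares fit of an AR(p) model; AIC = n*ln(RSS/n) + 2p."""
    # Regressor row for time t is [y[t-1], ..., y[t-p]].
    X = np.column_stack([y[p - i: len(y) - i] for i in range(1, p + 1)])
    target = y[p:]
    theta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ theta
    n = len(resid)
    return n * np.log(resid @ resid / n) + 2 * p

aics = {p: aic_for_order(y, p) for p in range(1, 6)}
best = min(aics, key=aics.get)
print(f"AIC selects order {best}")  # typically 2 for this simulated series
```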
7.3.2 Estimation (Optimization)
Once the model structure and order have been identified, the parameters that characterize this structure need to be estimated in some manner. There are many automatic methods of estimation or optimization available in this age of the digital computer. These methods range from the simplest deterministic procedures, usually based on the minimization of least squares cost functions, to more complex numerical optimization methods based on statistical concepts, such as maximum likelihood (ML). In general, the latter are more restricted, because of their underlying statistical assumptions, but they provide a more thoughtful and reliable approach to statistical inference; an approach which, when used correctly, includes the associated statistical diagnostic tests that are considered so important in statistical inference. In the present DBM modelling context, the estimation methods are based on optimal refined instrumental variable (RIV) methods for transfer function models (e.g. Young, 1984, 2008 and the references therein) and nonlinear modifications of these methods.
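The full RIV algorithms are beyond the scope of a short example, but the sketch below illustrates the simplest end of the range described above: estimating the parameters of a discrete-time transfer function (ARX) model by minimization of a least squares cost function. The data and parameter values are hypothetical; RIV methods refine this basic idea, using instrumental variables to avoid the bias that ordinary least squares incurs when measurement noise is present.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 400
u = rng.normal(size=N)            # input series (e.g. effective rainfall)
y = np.zeros(N)
a_true, b_true = -0.8, 0.5        # y[t] = 0.8*y[t-1] + 0.5*u[t-1] + e[t]
for t in range(1, N):
    y[t] = -a_true * y[t - 1] + b_true * u[t - 1] + 0.05 * rng.normal()

# Regression form: y[t] = phi[t] @ theta, with phi[t] = [-y[t-1], u[t-1]].
Phi = np.column_stack([-y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
a_hat, b_hat = theta
print(f"a = {a_hat:.3f} (true {a_true}), b = {b_hat:.3f} (true {b_true})")
```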
7.3.3 Conditional validation
Validation is a complex process and even its definition is controversial. Some academics (e.g. Konikow and Bredehoeft (1992), within a groundwater context; Oreskes et al. (1994), in relation to the whole of the Earth Sciences) question even the possibility of validating models (see also Chapter 2). To some degree, however, these arguments are rather philosophical and linked, in part, to questions of semantics: what is the 'truth'; what is meant by terms such as validation, verification and confirmation? Nevertheless, one specific, quantitative aspect of validation is widely accepted; namely 'predictive validation' (also referred to as 'cross-validation', or just 'validation'), in which the predictive potential of the model is evaluated on data other than that used in the identification and estimation stages of the analysis. While Oreskes et al. (1994) dismiss this approach, which
they term 'calibration and verification', their criticisms are rather weak and appear to be based on a perception that 'models almost invariably need additional tuning during the verification stage'. While some modellers may be unable to resist the temptation to carry out such additional tuning, so negating the objectivity of the validation exercise, this is a rather odd reason for calling the whole methodology into question. On the contrary, provided it proves practically feasible, there seems no doubt that validation, in the predictive sense used here, is an essential prerequisite for any definition of model efficacy, if not validity in a wider sense.
It appears normal these days to follow the Popperian view of validation (Popper, 1959) and consider it as a continuing process of falsification. Here, it is assumed that scientific theories (models in the present context) can never be proven universally true; rather, they are simply not yet proven to be false. This perspective yields a model that can be considered 'conditionally valid', in the sense that it can be assumed to represent the best theory of behaviour currently available that has not yet been falsified. Conditional validation thus means that the model has proven valid in this narrower, predictive sense. In the rainfall-flow context considered later, for example, it implies that, on the basis of the new measurements of the model input (rainfall) from the validation data set, the model produces flow predictions that are acceptable within the uncertainty bounds associated with the model.
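A minimal sketch of this predictive validation step, under the same assumed first-order model as the estimation example above (the simulated series stand in for rainfall and flow; all names are hypothetical): the model is estimated on a calibration segment only, and its one-step-ahead predictions on the unseen validation segment are then checked against the observations.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 600
u = rng.normal(size=N)            # input (e.g. rainfall)
y = np.zeros(N)                   # output (e.g. flow)
for t in range(1, N):
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1] + 0.05 * rng.normal()

# Estimate on the calibration segment only.
split = N // 2
Phi = np.column_stack([y[:split - 1], u[:split - 1]])
theta, *_ = np.linalg.lstsq(Phi, y[1:split], rcond=None)
sigma = np.std(y[1:split] - Phi @ theta)   # residual std on calibration data

# One-step-ahead predictions on the held-out validation segment.
Phi_val = np.column_stack([y[split:-1], u[split:-1]])
y_pred = Phi_val @ theta
inside = np.abs(y[split + 1:] - y_pred) < 2 * sigma
print(f"{100 * inside.mean():.1f}% of validation points within +/-2 sigma")
```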
Note this stress on the inherent uncertainty in the estimated model: one advantage of statistical estimation, of the kind considered in this chapter, is that the level of uncertainty associated with the model parameters and the stochastic inputs is quantified in the time series analysis. Consequently, the modeller should not be looking for perfect predictability (which no one expects anyway) but for predictability that is consistent with the quantified uncertainty associated with the model.
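In the least squares setting, this quantification is straightforward: an estimate of the parameter covariance matrix follows directly from the regression, as in the sketch below (same hypothetical data as the earlier examples; RIV estimation yields an analogous, statistically more refined, covariance estimate).

```python
import numpy as np

rng = np.random.default_rng(3)
N = 400
u = rng.normal(size=N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1] + 0.05 * rng.normal()

Phi = np.column_stack([y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
resid = y[1:] - Phi @ theta
sigma2 = resid @ resid / (len(resid) - len(theta))  # noise variance estimate
cov = sigma2 * np.linalg.inv(Phi.T @ Phi)           # parameter covariance
for name, est, se in zip(["a", "b"], theta, np.sqrt(np.diag(cov))):
    print(f"{name} = {est:.3f} +/- {2 * se:.3f} (approx. 95% interval)")
```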
It must be emphasized that conditional validation is simply a useful statistical diagnostic, which ensures that the model has certain desirable properties. It is not a panacea and it certainly does not prove the complete validity of the model if, by this term, we mean the establishment of the 'truth' (Oreskes et al., 1994). Models are, at best, approximations of reality designed for some specific objective; and conditional validation merely shows that this approximation is satisfactory in this limited predictive sense. In many environmental applications, however, such validation is sufficient to establish the credibility of the model and to justify its use in operational control, management and planning studies.