whatever term is used for validation/model evaluation,
it will always tend to overstate the case for belief in the
model results. Oreskes et al. (1994) state six reasons for
this problem. First, all environmental systems are open.
Logically, it is only possible to demonstrate the truth of
a closed system (although even this proposition is called
into question by Gödel's theorem - see the excellent
overview by Hofstadter, 1979). Secondly, there
are problems due to the presence of unknown param-
eters and the scaling of nonadditive parameters (see
below and Chapter 5). Thirdly, inferences and embedded
assumptions underlie all stages of the observation and
measurement of variables - dependent and independent
alike. Fourthly, most scientific theories are developed
by the addition of 'auxiliary hypotheses' - that is, those
not central to the principal theory, but fundamental in
a specific context for putting it into action. Thus, it is
impossible to tell whether the principal or an auxiliary
hypothesis is incorrect should deductive verification fail.
Fifthly, as we have seen, more than one model formu-
lation can provide the same output. This property is
known formally as nonuniqueness or underdetermina-
tion (the Duhem-Quine thesis - Harding, 1976). Sixthly,
errors in auxiliary hypotheses may cancel, causing incor-
rect acceptance of the model. Many modellers would
now accept that full validation is a logical impossibility
(e.g. Refsgaard and Storm, 1996; Senarath et al., 2000).
Morton and Suárez (2001) suggest that in most practical
contexts the term 'model' can be thought of as syn-
onymous with 'theory' or 'hypothesis', with the added
implication that they are being confronted and evaluated
with data. Often, the models represent simplifications
of complex, physically based theories, analogies of other
systems, summaries of data, or representations of the
theories themselves. It is this set of approaches that allows
the provisional nature of scientific knowledge to be tested.
Conversely, it is possible for models to continue being
used for a range of reasons relating to the social, eco-
nomic and political contexts of science (Oreskes and
Belitz, 2001).
Rykiel (1996) provides an overview of how valida-
tion has been employed in modelling, and distinguishes
(i) operational or whole-model validation (correspon-
dence of model output with real-world observations);
(ii) conceptual validation (evaluation of the underlying
theories and assumptions); and (iii) data validation (eval-
uation of the data used to test the model). He suggests that
there are at least 13 different sorts of validation procedure
that are commonly employed, explicitly or implicitly.
These procedures are:
face validation - the evaluation of whether model logic
and outputs appear reasonable;
Turing tests - where 'experts' are asked to distinguish
between real-world and model output (by analogy with
the test for artificial intelligence);
visualization techniques - often associated with a state-
ment that declares how well the modelled results match
the observed data;
comparison with other models - used for example in
general circulation model evaluations (although note
the high likelihood of developing an argument based
on circular logic here, especially where different models
share a common codebase!);
internal validity - e.g. using the same data set repeat-
edly in a stochastic model to evaluate whether the
distribution of outcomes is always reasonable;
event validity - i.e. whether the occurrence and pattern
of a specific event is reproduced by the model;
historical data validation - using split-sample tech-
niques to provide a subset of data to build a model
and a second subset against which to test the model
results (see also Klemeš, 1983, and the first sketch
following this list);
extreme-condition tests - whether the model behaves
'reasonably' under extreme combinations of inputs;
traces - whether the changes of a variable through time
in the model are realistic;
sensitivity analyses - to evaluate whether changes in
parameter values produce 'reasonable' changes in
model output (see below and the second sketch
following this list);
multistage validation (corresponding to the stages i, ii
and iii noted above);
predictive validation - comparison of model output
with actual behaviour of the system in question; and
statistical validation - whether the range of model
behaviour and its error structure matches that of
the observed system (but see the discussion on error
propagation below, and the third sketch following
this list).
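As an illustration of the split-sample approach, the following minimal sketch (in Python, using entirely synthetic data and a deliberately naive linear model, both assumed here purely for illustration) builds a model on the first half of a record and scores it against the withheld second half using the Nash-Sutcliffe efficiency:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical observed record: forcing x (e.g. rainfall) and response y
# (e.g. runoff). Both are synthetic placeholders.
x = rng.gamma(shape=2.0, scale=5.0, size=200)
y = 0.6 * x + rng.normal(0.0, 1.5, size=200)

# Split the record: the first half builds (calibrates) the model,
# the second half is withheld to test it.
x_cal, x_val = x[:100], x[100:]
y_cal, y_val = y[:100], y[100:]

# 'Build' the model on the calibration subset (an ordinary least-squares fit).
slope, intercept = np.polyfit(x_cal, y_cal, deg=1)

# Apply the calibrated model to the withheld subset and score the agreement
# with the Nash-Sutcliffe efficiency (1 = perfect; <= 0 = no better than
# predicting the observed mean).
y_pred = slope * x_val + intercept
nse = 1.0 - np.sum((y_val - y_pred) ** 2) / np.sum((y_val - y_val.mean()) ** 2)
print(f"Split-sample NSE on withheld data: {nse:.3f}")
```

The essential point is that the withheld data play no part in building the model; agreement on the calibration subset alone would tell us little.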
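Sensitivity analysis can be sketched just as simply. The one-at-a-time scheme below perturbs each parameter of a hypothetical model function by +/-10% and reports the relative change in output; the function and its nominal parameter values are placeholders rather than any model discussed here:

```python
import numpy as np

def model(params):
    """Hypothetical model: a simple function of three parameters a, b, c."""
    return params["a"] * np.exp(-params["b"]) + params["c"] ** 2

nominal = {"a": 2.0, "b": 0.5, "c": 1.0}
base = model(nominal)

# Perturb each parameter in turn by +/-10% of its nominal value and report
# the relative change in model output.
for name in nominal:
    for frac in (-0.10, 0.10):
        perturbed = dict(nominal)
        perturbed[name] = nominal[name] * (1.0 + frac)
        rel_change = (model(perturbed) - base) / base
        print(f"{name} {frac:+.0%}: output change {rel_change:+.2%}")
```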
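Finally, statistical validation in the indistinguishability sense quoted from Brown and Kulasiri below might be approximated by asking whether a standard two-sample test can separate field observations from model output. The sketch assumes synthetic lognormal samples and uses the Kolmogorov-Smirnov test from SciPy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic stand-ins: 'field' observations and model output drawn from
# slightly different lognormal distributions.
field = rng.lognormal(mean=1.0, sigma=0.4, size=150)
model_out = rng.lognormal(mean=1.05, sigma=0.45, size=150)

# Two-sample Kolmogorov-Smirnov test on the two samples.
stat, p_value = stats.ks_2samp(field, model_out)
print(f"KS statistic = {stat:.3f}, p = {p_value:.3f}")
```

A large p-value here means only that this particular technique fails to distinguish field from model data; it supports, but cannot prove, the model.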
Clearly, all of these tests provide some support for the
acceptance of a model, although some are more rigorous
than others. The more tests a model can successfully pass,
the more confidence we might have in it, although there
is still no reason to believe it absolutely for the reasons
discussed above. But in complex models, validation is
certainly a nontrivial procedure - Brown and Kulasiri
(1996: 132) note, for example, that 'a model can be
considered to be successfully validated if all available
techniques fail to distinguish between field and model
data'. Any model test will in part be evaluating the
simplifications upon which the model is based, in part