informal measure for periods of “similar” data (when the asymptotic limit on learning from
new data might be justified) but to combine evaluations for periods of different characteristics,
or from different types of observations, using one of the methods in Box 7.2.
B7.1.4 Fuzzy Measures for Model Evaluation
There have been some attempts to take a different approach to model evaluation based on
non-statistical measures. One approach is that of using fuzzy measures (Franks et al., 1998;
Aronica et al., 1998; Pappenberger et al., 2007b; Jacquin and Shamseldin, 2007). It has been
used particularly in situations where observational data are scarce and statistical measures
might be difficult to evaluate. The basic fuzzy measure can be defined as a simple function
of the error between observed and predicted variables, such as discharge. If the error is zero
(or within a certain range of zero) then the fuzzy measure is assumed to be at a maximum,
say unity. The measure then declines to zero as the error gets larger in some defined way. A
linear decline towards zero at some maximum allowable error is often assumed. The maximum
allowable error might be different for overprediction compared with underprediction. It might
also be different for different time steps or variables. The individual fuzzy measures for different
time steps or variables may then be combined in some way (linear addition, weighted addition,
multiplication, fuzzy union or fuzzy intersection; see Box 7.2) to create an overall measure
for model evaluation. Such measures can be treated as likelihood measures within the GLUE
methodology and arise naturally in the “limits of acceptability” approach to model evaluation
suggested by Beven (2006a). They have no foundation in statistical theory but they can be used
to express likelihood as a measure of belief in the predictions of a model.
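As a concrete illustration, the sketch below (in Python) implements one possible form of these ideas: a piecewise-linear fuzzy measure that takes the value of unity within a tolerance of zero error, declines linearly to zero at separate maximum allowable errors for underprediction and overprediction, and is then combined across time steps by weighted addition. The function names, parameter values and the particular membership shape are illustrative assumptions, not code from the studies cited above.

    import numpy as np

    def fuzzy_measure(predicted, observed, tol, max_under, max_over):
        # Error between predicted and observed variable (e.g. discharge)
        error = predicted - observed
        # Maximum belief (unity) within +/- tol of zero error
        if abs(error) <= tol:
            return 1.0
        if error > 0.0:
            # Overprediction: linear decline to zero at max_over
            return max(0.0, 1.0 - (error - tol) / (max_over - tol))
        # Underprediction: linear decline to zero at max_under
        return max(0.0, 1.0 - (-error - tol) / (max_under - tol))

    def combine(measures, weights=None):
        # Weighted addition of the per-time-step measures; fuzzy
        # intersection (min) or fuzzy union (max) are alternatives
        m = np.asarray(measures, dtype=float)
        w = np.ones_like(m) if weights is None else np.asarray(weights, dtype=float)
        return float(np.sum(w * m) / np.sum(w))

    # Illustrative discharge series (arbitrary units)
    obs = np.array([1.2, 3.5, 2.0, 0.8])
    sim = np.array([1.3, 2.9, 2.6, 0.8])
    measures = [fuzzy_measure(s, o, tol=0.05, max_under=1.0, max_over=0.5)
                for s, o in zip(sim, obs)]
    overall = combine(measures)  # overall measure for model evaluation

Note that replacing weighted addition with min(measures) (the fuzzy intersection) makes the overall measure zero whenever any single time step falls outside its maximum allowable error, which is the stricter reading of the limits of acceptability.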
Within the limits of acceptability approach, one way of making evaluations against different
types and magnitudes of observations comparable is to scale the performance of a model in
terms of a standardised score (Figure B7.1.2). The score has the value of zero at the observed
value, −1 at the lower limit of acceptability and +1 at the upper limit. Overprediction therefore
leads to positive scores and underprediction to negative scores. Series of these standardised
scores are shown for an application of dynamic TOPMODEL in Figure 7.14. These scores can
Figure B7.1.2 Residual scores.
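A minimal sketch of this scaling, assuming linear interpolation between the observed value and each limit of acceptability (the function name and the handling of predictions beyond the limits are illustrative assumptions):

    def standardised_score(predicted, observed, lower, upper):
        # 0 at the observed value, -1 at the lower limit of
        # acceptability, +1 at the upper limit; scores outside
        # [-1, 1] flag predictions beyond the limits
        if predicted >= observed:
            return (predicted - observed) / (upper - observed)
        return (predicted - observed) / (observed - lower)

    # e.g. observed discharge 2.0 with limits [1.5, 3.0] (illustrative)
    standardised_score(2.5, 2.0, 1.5, 3.0)   # 0.5  (overprediction)
    standardised_score(1.8, 2.0, 1.5, 3.0)   # -0.4 (underprediction)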
 