This methodology means that as new data become
available, the posterior parameter distributions
and prediction uncertainties can be continuously
refined. If any of the sampled parameter sets continue to provide good simulations of the available data (events), they will be retained as simulators of the observed system, but importantly their 'weight' will evolve depending on the total number of simulations retained and their individual ranking in terms of overall predictive capability.
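This retention and re-weighting step can be illustrated with a short sketch. The weighting scheme below is hypothetical, since the exact formula is not given here; it assumes one performance score per parameter set (lower is better), rejects sets that exceed a threshold, and assigns rank-based weights to the survivors so that better-ranked sets carry more weight.

```python
import numpy as np

def retain_and_reweight(scores, threshold):
    """Hypothetical GLUE-style update: reject parameter sets whose
    performance score exceeds the threshold, then weight the retained
    sets by their ranking (best score -> largest weight)."""
    scores = np.asarray(scores, dtype=float)
    retained = scores <= threshold                 # sets kept as 'simulators'
    if not retained.any():
        raise ValueError("no parameter set passed the threshold")
    # Rank the retained sets: rank 0 is the best (lowest) score.
    ranks = scores[retained].argsort().argsort()
    weights = (retained.sum() - ranks).astype(float)
    weights /= weights.sum()                       # weights sum to 1 over survivors
    return retained, weights

# Example: three sampled sets, a rejection threshold of 0.4.
keep, w = retain_and_reweight([0.2, 0.9, 0.35], threshold=0.4)
# keep -> [True, False, True]; w -> approximately [0.67, 0.33]
```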
Selecting a model performance measure and a
threshold of model rejection is a difficult, some-
times arbitrary procedure. Mismatches between
simulations and observations need to be tolerated
when accommodating the effects of input data
uncertainty in conjunction with other data uncer-
tainties not accounted for (see discussion in
Beven, 2006). Simulation performance is often in
the eye of the beholder: what appears to be a good
simulation to one model user is a poor simulation
to another, depending on the level of confidence
they are willing to accept in model output. Also,
since models often fail to simulate all aspects of the system dynamics equally well, the choice of performance measure reflects a judgment on which dynamics are the most important to get
right. These considerations should reflect the pri-
orities of the model application and be explicitly
defined. We suggest that objective model bench-
marks are often missing when setting up perform-
ance metrics, and we feel that a more thoughtful
approach should be applied. Ultimately the user
of the model predictions needs to define the per-
formance limits for the model simulations.
Therefore performance measures should, where
objective assessments can be made, consider the
quality of the observed data and hence the potential
uncertainties in the data series. In this study we
are explicit about the model performance and its
relation to the observed data for which we have
quantified the uncertainties from field measure-
ments where possible. To assess model perform-
ance we used a time-step based absolute deviation
between simulations and observations of both
discharge (Q) and suspended sediment concentration (SS). In the case of Q, this deviation was
normalized by the measurement uncertainty
interval for each time-step, which served as a
model-independent benchmark. No such bench-
mark was available, or could reasonably be
assumed, for SS. Being pragmatic, we therefore applied more conventional forms of performance measure when comparing with the SS data, while remaining more relaxed about the quality of the model fit because the error characteristics of these data may be mis-specified. The distribution
of deviations across the time-steps was summa-
rized by mean deviations and percentiles. For Q, the mean absolute deviation (Table 5.4) was ultimately used as the performance measure to be minimized. This implies the objec-
tive of seeking model simulations that are, on
average, close to or within the observed intervals
across all time-steps, although occasional large
deviations may be tolerated. This is a reasonable
starting objective for the purpose of this study,
although additional performance measures
should ideally be used to aid model diagnostics
(Freer et al., 1996; Clark et al., 2008). The corresponding performance measure for SS, for which no measurement uncertainty intervals were available, is the mean absolute error, MAE (Table 5.4).
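As a minimal sketch of the two measures, and assuming one plausible reading of the normalization (Table 5.4 is not reproduced here): the Q deviation at each time-step is taken as the distance of the simulated value from the observed uncertainty bounds, divided by the interval width, so that a simulation inside the interval contributes zero; the SS measure is the plain mean absolute error. The function names and the exact form of the normalization are illustrative assumptions.

```python
import numpy as np

def mean_abs_norm_deviation(q_sim, q_lower, q_upper):
    """Mean absolute deviation of simulated discharge from the observed
    uncertainty interval, normalized by the interval width at each
    time-step (illustrative form; the exact definition is in Table 5.4)."""
    q_sim = np.asarray(q_sim, float)
    lower = np.asarray(q_lower, float)
    upper = np.asarray(q_upper, float)
    below = np.clip(lower - q_sim, 0.0, None)   # shortfall below the interval
    above = np.clip(q_sim - upper, 0.0, None)   # excess above the interval
    return float(np.mean((below + above) / (upper - lower)))

def mean_abs_error(ss_sim, ss_obs):
    """Mean absolute error for suspended sediment concentration, used
    where no observation uncertainty interval is available."""
    ss_sim = np.asarray(ss_sim, float)
    ss_obs = np.asarray(ss_obs, float)
    return float(np.mean(np.abs(ss_sim - ss_obs)))
```

With this form, a low mean value corresponds to simulations that are, on average, within or close to the observed intervals, which matches the stated objective.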
Performance thresholds for both measures were set initially
for the purpose of visual model diagnostics, which
evaluated whether the retained model realiza-
tions could be considered 'behavioural' in the
sense of Beven (2006). For further discussion of
this type of exploratory analysis, the reader is
referred to Krueger et al . (2009). The model eval-
uation against the Q and SS data was carried out
sequentially, so that the sampled parameter sets
were first rejected or retained based on the mean absolute deviation performance threshold of 0.4 for Q, and the
remaining sets were then updated in the same
way based on the MAE performance threshold of
150 for SS . This reflects our understanding of
hydrology being the dominant driver of erosion,
and hence hydrology should be modelled suffi-
ciently well before further erosion processes are
evaluated to ensure that the model simulates
erosion 'for the right reasons' (Quinton, 1994;
Brazier et al ., 2000). The parameter sets retained
for the first event were further updated by condi-
tioning on the second event.
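The sequential conditioning can be sketched as follows, reusing the measure functions from the previous sketch. The model runner and the event structure are placeholders rather than the actual model interface; the thresholds follow the text, 0.4 for the Q measure and 150 for the SS MAE.

```python
def condition_on_event(param_sets, run_model, event,
                       q_threshold=0.4, ss_threshold=150.0):
    """Sequential rejection for one event: parameter sets are screened on
    the discharge measure first, and only the survivors are screened on
    the suspended-sediment MAE. `run_model` and the `event` dict are
    hypothetical placeholders for the actual model and data."""
    survivors = []
    for p in param_sets:
        q_sim, ss_sim = run_model(p, event)
        q_dev = mean_abs_norm_deviation(q_sim, event["q_lower"], event["q_upper"])
        if q_dev > q_threshold:
            continue                      # rejected on hydrology first
        if mean_abs_error(ss_sim, event["ss_obs"]) > ss_threshold:
            continue                      # then rejected on erosion
        survivors.append(p)
    return survivors

# Conditioning on the first event, then updating the survivors with the second:
# behavioural = condition_on_event(sampled_sets, run_model, event_1)
# behavioural = condition_on_event(behavioural, run_model, event_2)
```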