Parameter Estimation and Predictive Uncertainty - Rainfall-Runoff Modelling: The Primer


B7.3.2 Simple Monte Carlo Sampling

The sampling problem is much more difficult when we do not know the shape of the likelihood

surface before starting sampling. There may also be little prior information to guide the search,

since it can be difficult to estimate beforehand what effective values of the parameters (and

their potential interactions) might be needed to get good results from a model in matching the

available observations. It is quite common, for example, to only estimate some prior upper

and lower limits for particular parameters, without any real idea of what distribution to assume

between those limits.

Without such prior information there is initially little to guide the sampling strategy. So a

rather obvious choice of strategy is to use simple Monte Carlo sampling, in which random

values of each parameter are chosen independently across the specified ranges. Where prior

information is available, this is easily modified to samples of equal probability by sampling

across the cumulative probability range; this will result in a sampling density proportional to

the prior probability. The only problem with this very simple strategy is that of taking enough

samples. Similar issues arise as for the forward uncertainty estimation problem, except that

now we do not know where to concentrate the search. If only a small number of samples are

taken, areas of higher likelihood on the surface might be missed. The method is therefore only

useful where model runs that might be retained as behavioural (higher likelihood) are spread

through the parameter space. Otherwise, very large numbers of samples might be required to

define the shape of local areas of high likelihood. However, this is commonly the case where

non-statistical likelihood measures are used within the GLUE methodology and many GLUE

applications have used this type of simple Monte Carlo sampling. Refinement of the simple

sampling strategy is also possible by discretising the space into areas where high likelihoods

and low likelihoods have been found by some initial search and then concentrating sampling

in the sub-spaces of high likelihood. A variety of methods are described by Beven and Binley

(1992), Spear et al. (1994), Shorter and Rabitz (1997), Bardossy and Singh (2008) and Tonkin

and Doherty (2009).
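
The two sampling variants described above can be sketched in a few lines. This is only an illustration under stated assumptions: the parameter names `k` and `s`, their ranges, the function `toy_likelihood` (which stands in for running a model and scoring it against observations), the behavioural threshold of 0.5 and the normal prior are all hypothetical and do not come from the text.

```python
import random
from statistics import NormalDist


def sample_uniform(ranges, n, rng):
    """Draw n parameter sets, each parameter independently uniform in its range."""
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        for _ in range(n)
    ]


def sample_from_prior(priors, n, rng):
    """Sample uniformly across the cumulative probability range and map each
    draw back through the prior's inverse CDF, so that the sampling density
    is proportional to the prior probability."""
    samples = []
    for _ in range(n):
        params = {}
        for name, prior in priors.items():
            u = rng.random()                        # uniform in [0, 1)
            u = min(max(u, 1e-12), 1.0 - 1e-12)     # inv_cdf needs 0 < u < 1
            params[name] = prior.inv_cdf(u)
        samples.append(params)
    return samples


def toy_likelihood(params):
    """Hypothetical informal likelihood measure, peaking at k = 0.3, s = 50."""
    dk = (params["k"] - 0.3) / 0.1
    ds = (params["s"] - 50.0) / 20.0
    return 1.0 / (1.0 + dk ** 2 + ds ** 2)


rng = random.Random(42)

# Case 1: only upper and lower limits are known (assumed values).
ranges = {"k": (0.0, 1.0), "s": (0.0, 100.0)}
runs = sample_uniform(ranges, 5000, rng)
threshold = 0.5                                     # assumed behavioural threshold
behavioural = [p for p in runs if toy_likelihood(p) >= threshold]
print(len(behavioural), "of", len(runs), "runs retained as behavioural")

# Case 2: a prior distribution is available (assumed normal prior for k).
priors = {"k": NormalDist(mu=0.3, sigma=0.1)}
prior_runs = sample_from_prior(priors, 5000, rng)
```

In this toy case only a small fraction of the uniform samples is retained, which illustrates why the number of samples becomes the limiting factor when the behavioural region is small relative to the sampled ranges.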

B7.3.3 Importance Sampling: Monte Carlo Markov Chain

As noted elsewhere, statistical likelihood functions tend to stretch the response surface greatly,

resulting in one or more areas of high likelihood that are highly localised. This means that

simple Monte Carlo search algorithms would be highly inefficient for such cases and a more

directed strategy is needed to define the shape of the surface with any degree of detail. Most

strategies of this type are adaptive in the sense of using past samples to guide the choice of

new samples and have the aim of finishing with a set of samples that are distributed in the

parameter space with a density that is directly proportional to the local likelihood. This is a

form of importance sampling. The most widely used techniques for importance sampling in

hydrological modelling are those of the Monte Carlo Markov Chain (MC²) family. They have

been used in rainfall-runoff modelling at least since Kuczera and Parent (1998).
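
As a rough illustration of how a chain of samples can end up distributed with a density proportional to the likelihood, the sketch below uses the standard random-walk Metropolis algorithm, the simplest member of this family. It is not necessarily the scheme used in the studies cited here. The single parameter `theta`, its limits, the sharply peaked `log_likelihood`, the step size and the burn-in length are all assumptions made for the example, and a uniform prior between the limits is assumed.

```python
import math
import random


def log_likelihood(theta):
    """Hypothetical, highly localised log-likelihood (Gaussian, mean 2, sd 0.1)."""
    return -0.5 * ((theta - 2.0) / 0.1) ** 2


def metropolis(log_like, start, step, n_iter, rng, lower=0.0, upper=10.0):
    """Random-walk Metropolis with a uniform prior on [lower, upper]."""
    current = start
    current_ll = log_like(current)
    chain = []
    for _ in range(n_iter):
        # Symmetric Gaussian proposal centred on the current point.
        candidate = current + rng.gauss(0.0, step)
        if lower <= candidate <= upper:
            candidate_ll = log_like(candidate)
            # Accept with probability min(1, L(candidate) / L(current)),
            # evaluated in log space; 1 - random() lies in (0, 1].
            if math.log(1.0 - rng.random()) < candidate_ll - current_ll:
                current, current_ll = candidate, candidate_ll
        # A rejected (or out-of-range) move repeats the current point.
        chain.append(current)
    return chain


rng = random.Random(1)
chain = metropolis(log_likelihood, start=5.0, step=0.2, n_iter=20000, rng=rng)

kept = chain[5000:]                       # discard an assumed burn-in period
mean = sum(kept) / len(kept)
sd = math.sqrt(sum((x - mean) ** 2 for x in kept) / len(kept))
print(f"posterior mean ~ {mean:.3f}, sd ~ {sd:.3f}")
```

Because moves to lower likelihood are sometimes accepted, the chain can leave a local peak, while most of its time is spent where the likelihood is high. A simple uniform Monte Carlo search over the same range would place only a few percent of its samples in the narrow peak.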

The concept that underlies MC² sampling is quite simple to understand (see also Beven,

2009). The scheme starts with a set of random samples chosen according to some proposal

scheme. Each chosen point represents a parameter set. The model is run with that parameter

set and a posterior likelihood for that point is calculated. A new set of points around that

point is then chosen, consistent with the proposal distribution. Whether a model run is made

at the new point depends on the likelihood of the original point and a random number, so

that there is a probability of making a run even if the likelihood of the original point was low.

This is to guard against not sampling regions of the space where a new high likelihood area

might be found. Once the sampling is complete, the chain is checked to see if the sample of

points is converging on a consistent posterior distribution. If not, another iteration is carried

out, which might involve adapting the proposal distribution to refine the sampling. The process

is effectively a chain of random walks across the likelihood surface, where the probability of
