an adaptive search strategy will seek out those areas of higher likelihood and concentrate the
sampling in those areas with a local sampling density proportional to the local magnitude of
the likelihood.
This is effectively a process of trying to define the shape of a likelihood surface in the
parameter space, concentrating on the peaks in that surface. Those peaks are easiest to find
if they are not too localised and if they are close to the prior regions of high likelihood. This
is not always the case, however, and one of the important features of an adaptive sampling
scheme is that it should not concentrate exclusively on those areas of high likelihood that
have already been found but should continue to sample other parts of the space in case there
are other areas of high likelihood that have not yet been found. Different search algorithms
have different ways of doing this. In high-dimensional spaces of complex structure there is
still a possibility of missing areas of high likelihood just because they never get sampled. It
has also recently been appreciated that the complexity of the surface, and consequently the
difficulty of finding areas of high likelihood, might also be the result of the numerics of a
particular model implementation (see, for example, Schoups et al., 2008, 2010; Kavetski and
Clark, 2010). Much more on sampling and search techniques is discussed by Beven (2009).
Note that the techniques presented in the following discussion can be extended to multiple
model structures.
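The balance described above, between concentrating on known peaks and continuing to explore elsewhere, can be illustrated with a minimal sketch. It assumes a random-walk Metropolis sampler on a single parameter with a uniform prior, which is only one of many possible adaptive schemes and is not claimed to be the method of any of the cited authors. The two-peaked `likelihood` function, the parameter range, the step size and the `global_prob` setting are all hypothetical choices made for illustration.

```python
import math
import random


def likelihood(x):
    """Hypothetical two-peaked likelihood surface on one parameter axis."""
    peak_1 = math.exp(-0.5 * ((x - 2.0) / 0.5) ** 2)
    peak_2 = 0.5 * math.exp(-0.5 * ((x - 7.0) / 0.5) ** 2)
    return peak_1 + peak_2


def metropolis(n_steps, lower=0.0, upper=10.0, step=0.5, global_prob=0.1, seed=42):
    """Random-walk Metropolis with occasional proposals from the whole range.

    The occasional global proposal is an assumed device for continuing to
    sample parts of the space away from the peaks already found.
    """
    rng = random.Random(seed)
    x = rng.uniform(lower, upper)
    chain = []
    for _ in range(n_steps):
        if rng.random() < global_prob:
            proposal = rng.uniform(lower, upper)   # explore anywhere in the range
        else:
            proposal = x + rng.gauss(0.0, step)    # refine around the current point
        # Proposals outside the prior range are rejected (zero prior density).
        if lower <= proposal <= upper:
            if rng.random() < likelihood(proposal) / likelihood(x):
                x = proposal
        chain.append(x)
    return chain


chain = metropolis(20000)
near_peak_1 = sum(1 for x in chain if abs(x - 2.0) < 1.5) / len(chain)
near_peak_2 = sum(1 for x in chain if abs(x - 7.0) < 1.5) / len(chain)
print(near_peak_1, near_peak_2)
```

Because both proposal types are symmetric, the long-run density of the chain is proportional to the likelihood, so the taller peak should attract roughly twice as many samples as the smaller one. Setting `global_prob` to zero makes it more likely that the sampler stays in whichever peak it finds first.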
An important issue that arises in the implementation of such methods is the generation
of random numbers. Since it might be necessary to generate very large numbers of random
numbers, the particular characteristics of different random number schemes might have an
effect on the sampling. In fact, there are no truly random number schemes on digital computers;
there are only algorithms for generating pseudo-random numbers (see Beven, 2009). As such there
are some good algorithms (with long repeat sequences and low correlation) and some poor
algorithms (including some of the standard calls made available in programming languages).
Even the choice of parameters for a particular algorithm can have an important impact on
the apparent randomness of the resulting sequences. A modern algorithm that, with suitable
parameters, can give an enormous period before repeating is the Mersenne Twister (Matsumoto
and Nishimura, 1998).
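As a brief illustration, Python's standard `random` module uses the Mersenne Twister as its core generator, so a seeded, reproducible stream can be sketched as follows. The seed values and the helper name `make_stream` are arbitrary choices for this example.

```python
import random


def make_stream(seed):
    """Return an independent pseudo-random stream with its own state."""
    return random.Random(seed)


stream_a = make_stream(12345)
stream_b = make_stream(12345)   # same seed: identical sequence
stream_c = make_stream(54321)   # different seed: different sequence

draws_a = [stream_a.random() for _ in range(5)]
draws_b = [stream_b.random() for _ in range(5)]
draws_c = [stream_c.random() for _ in range(5)]

print(draws_a == draws_b)   # the sequence is fully determined by the seed
print(draws_a == draws_c)
```

Recording the seed with each set of model runs allows a sampling experiment to be repeated exactly, which is one practical consequence of the numbers being pseudo-random rather than random.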
B.7.3.1 Structured Sampling for Forward Uncertainty Analysis
In a forward uncertainty analysis, the areas of high likelihood in the space to be sampled will
be known explicitly from the definition of the prior distributions for the parameter values.
This will include any correlations between different parameters if necessary (either specified
as a covariance matrix if the distributions are assumed to be multivariate Gaussian or, more
generally, in terms of copula functions; see Beven, 2009). An obvious sampling strategy is then to
sample each parameter randomly in a way that reflects the probability density of that parameter
(as well as any effects of covariation). However, even if only 10 values are taken per parameter,
the number of combinations grows as 10 raised to the power of the number of parameters, which
very rapidly becomes a large number of samples when there are more than a few parameters.
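A minimal sketch of this strategy, assuming independent parameters, is given here. The parameter names, the prior distributions and their values in `PRIORS` are hypothetical and chosen only for illustration. The second function shows how quickly the number of parameter sets grows if 10 values per parameter are combined exhaustively.

```python
import random

# Hypothetical independent priors; names, shapes and values are illustrative only.
PRIORS = {
    "k": lambda rng: rng.uniform(0.1, 2.0),              # uniform prior
    "s_max": lambda rng: rng.gauss(100.0, 15.0),         # Gaussian prior
    "alpha": lambda rng: rng.lognormvariate(0.0, 0.5),   # log-normal prior
}


def sample_priors(n_samples, seed=1):
    """Draw parameter sets at random according to each prior density."""
    rng = random.Random(seed)
    return [
        {name: draw(rng) for name, draw in PRIORS.items()}
        for _ in range(n_samples)
    ]


def full_grid_size(values_per_parameter, n_parameters):
    """Number of parameter sets if every combination of values is run."""
    return values_per_parameter ** n_parameters


samples = sample_priors(1000)
grid_sizes = {n: full_grid_size(10, n) for n in (2, 5, 10)}
print(samples[0])
print(grid_sizes)
```

With 10 values per parameter, 5 parameters already require 100 000 model runs and 10 parameters require 10 000 000 000.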
More efficient, if more approximate, sampling techniques have been proposed, of which
the most commonly used is Latin Hypercube sampling (Iman and Conover, 1982). In this
approach the number of samples is decided beforehand. Each parameter axis is then split into
the same number of values but using a cumulative probability scale so that for each parameter
each sample will be of equal probability. For the simplest case of independent parameters
(and it is often very difficult to specify parameter correlations a priori except in some special
circumstances), a value of each parameter is chosen randomly to create a parameter set.
The next sample chooses a new set of values, but without replacement, so that once all the
samples have been chosen all the parameter values have been used. Iman and Conover (1982)
show how rank correlation between parameters can be introduced into the Latin Hypercube
sampling process.
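The procedure for independent parameters can be sketched as follows. This is a minimal illustration under stated assumptions: each equal-probability interval is represented by the value at its midpoint on the cumulative probability scale, the priors and parameter names are hypothetical, and the rank correlation method of Iman and Conover (1982) is not implemented.

```python
import random
from statistics import NormalDist


def latin_hypercube(n_samples, inverse_cdfs, seed=0):
    """Latin Hypercube sample for independent parameters.

    inverse_cdfs maps each parameter name to a function that converts a
    cumulative probability in (0, 1) to a parameter value.
    """
    rng = random.Random(seed)
    columns = {}
    for name, inverse_cdf in inverse_cdfs.items():
        # One cumulative probability per equal-probability interval (midpoint).
        probabilities = [(i + 0.5) / n_samples for i in range(n_samples)]
        # Shuffling pairs the values at random across parameters, so each
        # value is used exactly once (sampling without replacement).
        rng.shuffle(probabilities)
        columns[name] = [inverse_cdf(p) for p in probabilities]
    return [
        {name: values[i] for name, values in columns.items()}
        for i in range(n_samples)
    ]


# Hypothetical priors, for illustration only.
inverse_cdfs = {
    "k": lambda p: 0.1 + p * (2.0 - 0.1),          # uniform on [0.1, 2.0]
    "s_max": NormalDist(100.0, 15.0).inv_cdf,      # Gaussian prior
}

design = latin_hypercube(10, inverse_cdfs)
for parameter_set in design:
    print(parameter_set)
```

With 10 samples, every parameter set uses a different one of the 10 equal-probability intervals for each parameter, so the full range of each prior is covered with only 10 model runs.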
 