accepted moves to proposed moves should be close to neither 0 nor 1. If
the target distribution is multivariate Gaussian, then the optimal acceptance rate
is 0.23 (Gelman et al. 2004), and a value of approximately 20% has been shown
to work well for non-Gaussian posterior PDFs as well (Geyer and Thompson
1995). Roberts et al. (1997) determined that, in the limit of a large-dimensional state
space and for posterior distributions in which each variable is i.i.d., the optimal
acceptance rate is precisely 23.4%. Roberts and Rosenthal (2001) showed that the
optimal multivariate Normal proposal covariance $\Sigma_p$ should be proportional to the
covariance $\Sigma$ of the target distribution; $\Sigma_p = k\Sigma$. Given a multivariate Normal
target density with covariance $\Sigma$ and dimension $d$, the optimal Normal
proposal covariance is

$$\Sigma_p = \frac{(2.38)^2}{d}\,\Sigma \qquad (3.8)$$
Of course, the difficulty is that (1) $\Sigma$ is typically not known a priori and (2) there
is no guarantee the target distribution is multivariate Normal. Though care must be
taken not to blindly tune to the "optimal" acceptance rate, the theory laid out in
Roberts et al. (1997) and Roberts and Rosenthal (2001) serves as a useful starting
point.
In practice, the following procedure has proven to work well for most problems
(a minimal code sketch follows the list):
(i) Run a pilot MCMC chain that generates an ensemble of realizations of
$P(y \mid x)$ and compute an approximate $\Sigma$, assuming the posterior is multivariate Normal.
(ii) Construct an initial $\Sigma_p$ from (3.8) above.
(iii) Monitor the acceptance rate in the early stages of the algorithm and ensure it
stabilizes between 10 and 60%.
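The following Python sketch walks through steps (i)-(iii) using a random-walk Metropolis sampler; the stand-in log-posterior log_post, the chain lengths, and the initial diagonal proposal are all assumptions of this example, not prescriptions from the text:

import numpy as np

# Stand-in 3-D Gaussian log-posterior for P(y | x); illustrative only
target_cov = np.array([[2.0, 0.3, 0.0],
                       [0.3, 1.0, 0.1],
                       [0.0, 0.1, 0.5]])
target_prec = np.linalg.inv(target_cov)

def log_post(x):
    return -0.5 * x @ target_prec @ x

def metropolis(log_post, x0, prop_cov, n_iter, rng):
    # Random-walk Metropolis with a fixed multivariate Normal proposal
    d = len(x0)
    chain = np.empty((n_iter, d))
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    n_accept = 0
    for i in range(n_iter):
        x_new = rng.multivariate_normal(x, prop_cov)
        lp_new = log_post(x_new)
        if np.log(rng.uniform()) < lp_new - lp:   # Metropolis acceptance test
            x, lp = x_new, lp_new
            n_accept += 1
        chain[i] = x
    return chain, n_accept / n_iter

rng = np.random.default_rng(0)
d = 3

# (i) pilot chain with a rough diagonal proposal; approximate the posterior covariance
pilot, _ = metropolis(log_post, np.zeros(d), 0.1 * np.eye(d), 5000, rng)
sigma_hat = np.cov(pilot, rowvar=False)

# (ii) initial proposal covariance from Eq. (3.8)
prop_cov = (2.38 ** 2 / d) * sigma_hat

# (iii) run the chain and check that the acceptance rate settles between roughly 10 and 60 %
chain, acc_rate = metropolis(log_post, pilot[-1], prop_cov, 20000, rng)
print("acceptance rate:", round(acc_rate, 2))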
If the target distribution is far from multivariate Gaussian, the result will still
be slow mixing, but the chain will be more efficient than if left untuned.
The question remains: in the absence of knowledge of the true covariance of
the target distribution, how can one optimally choose the proposal covariance?
Adaptive algorithms are the most commonly used solution to this problem and
are nearly uniformly employed during a period that most authors refer to as
"burn-in", though the term burn-in is rather confusing, as it has been applied both
to the practice of proposal tuning and to the rejection of the initial portion of the chain
(see Sect. 3.3.2 below). Increasingly, adaptive algorithms are used over the length of
the chain (e.g., Haario et al. 2001, 2006; Roberts and Rosenthal 2007, 2009; Vrugt
et al. 2009; Vrugt and Ter Braak 2011), but a full discussion of this topic is beyond
the scope of this paper. Adaptive proposal tuning is typically done as follows. The
user first selects a proposal covariance (typically diagonal) under the assumption
that the individual parameter proposal variances are proportional to the realistic
range of values of each. A starting point for the chain is selected and Metropolis-
Hastings sampling commences. After a set of n iterations, the sample covariance
$\Sigma_n$ is computed and the proposal covariance matrix is updated using these values.
The new proposal covariance is then held fixed for the next m iterations, after which
the most recent set of n values is used to produce an updated proposal covariance.
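A sketch of this batch-update scheme in Python is given below, again with an illustrative 2-dimensional Gaussian standing in for the posterior; the batch sizes n and m, the initial diagonal variances, the reuse of the (3.8) scaling, and the small diagonal jitter term are all assumptions of this example:

import numpy as np

# Illustrative 2-D Gaussian stand-in for the posterior
prec = np.linalg.inv(np.array([[1.0, 0.8],
                               [0.8, 2.0]]))

def log_post(x):
    return -0.5 * x @ prec @ x

rng = np.random.default_rng(1)
d, n, m = 2, 500, 2000            # n: batch used to re-estimate; m: iterations between updates
prop_cov = np.diag([0.5, 0.5])    # initial diagonal proposal from each parameter's realistic range
x = np.zeros(d)
lp = log_post(x)
samples = []
next_update = n                   # first update after n iterations, then every m iterations

for it in range(1, 20001):
    x_new = rng.multivariate_normal(x, prop_cov)
    lp_new = log_post(x_new)
    if np.log(rng.uniform()) < lp_new - lp:    # Metropolis acceptance
        x, lp = x_new, lp_new
    samples.append(x.copy())
    if it == next_update:
        # Re-estimate the proposal from the most recent n samples, scaled as in (3.8);
        # the small diagonal jitter (an assumption here) keeps the proposal positive definite
        recent = np.asarray(samples[-n:])
        prop_cov = (2.38 ** 2 / d) * np.cov(recent, rowvar=False) + 1e-8 * np.eye(d)
        next_update += m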