decision theory. Their 1961 book, Applied Statistical Decision Theory, contained a
detailed account of Bayesian analytical methods.
The modern recognition of how Bayesian and frequentist methods can
work together began in the 1980s. In 1984, Adrian Smith, a professor of statis-
tics at the University of Nottingham in England, wrote that “efficient numerical
integration procedures are the key to more widespread use of Bayesian meth-
ods.” 5 Six years later, with Alan Gelfand from the University of Connecticut,
Smith wrote a very influential paper showing that the difficult calculations
required to apply Bayesian methods to realistic problems could be estimated
using the Monte Carlo method. The Monte Carlo method is a forecasting technique applied in situations where the problem is too complex for exact statistical analysis. As we have seen in Chapter 5, the method involves running multiple trials using random variables: the larger the number of trials, the better the predictions become. A related technique, which mathematicians call a Markov chain, uses probability to predict sequences of events. A Markov chain, named for the Russian mathematician Andrei Markov, is a sequence of events in which the probability of each event depends only on the event just before it. The combination of the two methods is known as Markov chain Monte Carlo (MCMC).
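To make these ideas concrete, here is a minimal sketch in Python of a random-walk Metropolis sampler, one of the simplest MCMC algorithms, estimating the bias of a coin from ten hypothetical flips. The data, the flat prior, and the step size are illustrative assumptions rather than anything specified in the text; the point is only that each proposal depends on the current state alone, and that more trials give a better estimate.

import random
import math

# Hypothetical data: 7 heads out of 10 coin flips.
heads, flips = 7, 10

def log_posterior(p):
    # Log of likelihood times a flat prior for the coin's heads-probability p.
    if p <= 0.0 or p >= 1.0:
        return float("-inf")
    return heads * math.log(p) + (flips - heads) * math.log(1.0 - p)

def metropolis(n_steps, step_size=0.1):
    # Random-walk Metropolis: each proposal depends only on the current state,
    # so the sequence of samples forms a Markov chain.
    p, samples = 0.5, []
    for _ in range(n_steps):
        proposal = p + random.uniform(-step_size, step_size)
        # Accept with probability min(1, posterior(proposal) / posterior(current)).
        if math.log(random.random()) < log_posterior(proposal) - log_posterior(p):
            p = proposal
        samples.append(p)
    return samples

samples = metropolis(50_000)[5_000:]          # discard early, unconverged steps
print("estimated posterior mean:", sum(samples) / len(samples))   # roughly 8/12 under a flat prior

The many retained samples approximate the posterior distribution of the coin's bias, which is exactly the kind of quantity that is hard to compute analytically for realistic models.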
A former student of Adrian Smith's, David Spiegelhalter, working for the
Medical Research Council, a government research funding agency in the United
Kingdom, wrote a program for the analysis of complex statistical models using
MCMC methods. Spiegelhalter's program generated random samples using a
method called Gibbs sampling. He released his BUGS program - an acronym for
Bayesian Inference Using Gibbs Sampling - in 1991. BUGS has since become
one of the most widely used Bayesian software packages with more than thirty
thousand downloads and applications in many different research areas rang-
ing from geology and genetics to sociology and archaeology. Spiegelhalter has
applied the Bayesian approach to clinical trials and epidemiology.
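As a rough illustration of the kind of sampling BUGS automates, the sketch below implements Gibbs sampling for a toy model of two correlated normal variables. The model and the correlation value are assumptions chosen for simplicity, not an example taken from BUGS itself; the essential idea is that each step redraws one variable from its exact conditional distribution given the others.

import random
import math

def gibbs_bivariate_normal(n_steps, rho=0.8):
    # Gibbs sampler for two standard normal variables with correlation rho.
    # Each step redraws one variable conditioned on the current value of the other.
    x, y = 0.0, 0.0
    sd = math.sqrt(1.0 - rho * rho)        # conditional standard deviation
    samples = []
    for _ in range(n_steps):
        x = random.gauss(rho * y, sd)      # draw x given the current y
        y = random.gauss(rho * x, sd)      # draw y given the new x
        samples.append((x, y))
    return samples

draws = gibbs_bivariate_normal(20_000)[2_000:]   # discard burn-in
mean_x = sum(x for x, _ in draws) / len(draws)
print("sample mean of x:", mean_x)               # should be close to 0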
The startling growth in Bayesian applications was due both to the availabil-
ity of manageable numerical methods for estimating posteriors using MCMC
sampling and the widespread availability of powerful desktop computers. In
our examples, we have only considered simple problems with few variables.
In real problems, statisticians typically look to find relationships among large
numbers of variables. At the end of the 1980s, there was a breakthrough in
applying Bayesian methods. Turing Award recipient Judea Pearl ( B.14.3 ) showed
that Bayesian networks, graphical representations of a set of random variables
and the structure of their joint probability distribution, were a powerful tool
for performing complex Bayesian analyses. A very simple Bayesian network is
shown in Figure 14.2 .
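As a rough illustration of how such a network is used, the sketch below encodes the rain-sprinkler-grass structure of Figure 14.2 and computes the probability of rain given that the grass is observed to be wet. The conditional probability tables are hypothetical numbers invented for this example; only the structure of the graph comes from the figure.

from itertools import product

# Hypothetical conditional probability tables; the numbers are invented for
# illustration, and only the graph structure follows Figure 14.2.
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: {True: 0.01, False: 0.99},     # P(sprinkler | rain)
               False: {True: 0.4, False: 0.6}}
p_wet = {(True, True): 0.99, (True, False): 0.9,    # P(grass wet | rain, sprinkler)
         (False, True): 0.8, (False, False): 0.0}

def joint(rain, sprinkler, wet):
    # The network's arrows factor the joint distribution as P(R) * P(S | R) * P(W | R, S).
    pw = p_wet[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[rain][sprinkler] * (pw if wet else 1.0 - pw)

# Query: how probable is rain, given that the grass is observed to be wet?
numerator = sum(joint(True, s, True) for s in (True, False))
evidence = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
print("P(rain | grass wet) =", numerator / evidence)

The same pattern of factoring a joint distribution along the arrows of the graph, and then summing out the unobserved variables, scales to networks with many more variables.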
As a final example of the advances made by Bayesian analysis, David
Heckerman, a machine-learning researcher at Microsoft Research, says, “The
whole thing about being a Bayesian is that all probability represents uncer-
tainty and, anytime you see uncertainty, you represent it with probability.
And that's a whole lot bigger than Bayes' theorem.” 6 For his PhD thesis at
Stanford University, Heckerman introduced Bayesian methods and graphical
networks into expert systems to capture the uncertainties of expert knowl-
edge. His “probabilistic expert system” was called Pathfinder and was used to
assist medical professionals in diagnosing lymph node disease. At Microsoft
Fig. 14.2. A simple Bayesian network showing the structure of the joint probability distribution for rain, sprinkler, and grass. The diagram captures the fact that rain influences whether the sprinkler is activated, and both rain and the sprinkler influence whether the grass is wet. This is an example of a directed acyclic graph.
B.14.3. Judea Pearl received the
Turing Award in 2011 for develop-
ing a calculus for causal reasoning
based on Bayesian belief networks.
This new approach allowed the prob-
abilistic prediction of future events
and also the selection of a sequence
of actions to achieve a given goal.
His theoretical framework has given
strong momentum to the renewed
interest in AI among the computer
science community.