(4 cards divided by 52). If two cards are drawn at random, the probability of the
second card being an ace depends on whether the first card was an ace. If it was,
then the probability of the second card being an ace is 0.059 (3 aces among the
51 remaining cards). If it wasn't, the probability rises slightly to 0.078
(4 aces among the 51 remaining cards).
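The arithmetic above can be checked with a few lines of Python, using exact fractions to avoid rounding surprises:

```python
from fractions import Fraction

# Probability that the first card drawn is an ace: 4 of 52 cards.
p_first_ace = Fraction(4, 52)

# After one card is removed, 51 cards remain. The probability that the
# second card is an ace depends on what the first card was:
p_ace_given_ace = Fraction(3, 51)      # 3 aces left among 51 cards
p_ace_given_not_ace = Fraction(4, 51)  # all 4 aces still among 51 cards

print(float(p_first_ace))        # ≈ 0.0769
print(float(p_ace_given_ace))    # ≈ 0.0588
print(float(p_ace_given_not_ace))  # ≈ 0.0784
```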
Cox showed that quantifying beliefs numerically and requiring logical and
consistent reasoning leads to exactly the same rules for beliefs as for physical
probabilities. So there was no dispute about the validity of Bayes' Rule between
frequentists and Bayesians. Instead, the controversy was about using subjec-
tive beliefs in data analysis rather than just using frequentist probabilities. The
importance of Bayes' Rule for data analysis is apparent if proposition X is a
hypothesis (that is, an idea or explanation) and Y is experimental data:

Prob(hypothesis | data and B) ∝ Prob(data | hypothesis and B) × Prob(hypothesis | B)

The symbol ∝ means that the left-hand side is proportional to the right-hand
side of the equation. In other words, the probability of the hypothesis being
true given the data is proportional to the probability that we would have
observed the measured data if the hypothesis were true. The second factor on
the right-hand side is the probability of our hypothesis, Prob(hypothesis | B).
This is the prior probability and represents our state of belief before
we have included the measured data. By Bayes' Rule, we see that this prior
probability is modified by the experimental measurements using the quantity
Prob(data | hypothesis and B), known as the likelihood function. This
gives us the posterior probability, Prob(hypothesis | data and B), which is
our new belief in the hypothesis, after taking into account the new data. The
likelihood function uses a statistical model that gives the probability of the
observed data for various values of some unknown parameter. For estimating
the parameters of a model we can ignore the denominator, Prob(data | B),
because it is just a scaling factor that does not depend explicitly on the
hypothesis. However, for model comparisons, the denominator is important
and is called the evidence .
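The prior-times-likelihood recipe above can be sketched numerically. The example below is a minimal illustration, not from the text: it estimates the bias of a coin (a made-up parameter theta) on a grid, with a flat prior and hypothetical data of 7 heads in 10 flips.

```python
import numpy as np

# Grid of candidate values for the unknown parameter theta (coin bias).
theta = np.linspace(0.0, 1.0, 101)

# Flat prior: our state of belief before seeing the data.
prior = np.ones_like(theta)
prior /= prior.sum()

# Likelihood function: probability of the observed data (7 heads in
# 10 flips, a hypothetical measurement) for each candidate theta.
heads, flips = 7, 10
likelihood = theta**heads * (1 - theta)**(flips - heads)

# Bayes' Rule: posterior ∝ likelihood × prior. The normalising sum
# plays the role of the evidence, Prob(data | B).
unnormalised = likelihood * prior
posterior = unnormalised / unnormalised.sum()

print(theta[np.argmax(posterior)])  # most probable bias: 0.7
```

With a flat prior the posterior peaks where the likelihood does, which is why the denominator can be ignored when only estimating parameters; comparing two different models, by contrast, would require the normalising sums themselves.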
As Sharon McGrayne shows in her book The Theory That Would Not Die,
Bayesian reasoning about uncertainty persisted in some unlikely places, even in
times when the frequentists were in the ascendancy. In 1918, Albert Whitney,
who had taught insurance mathematics at the University of California, Berkeley,
invented credibility theory , a Bayesian method for pricing insurance premiums
by assigning weights to the available evidence based on its believability. In the
1930s, Cambridge geophysicist Harold Jeffreys studied earthquakes and tsuna-
mis from a Bayesian point of view and published his classic text, Theory of
Probability, in 1939, just before World War II broke out.
During World War II, a Bayesian approach to uncertainty played a deci-
sive role in winning the battle against the German U-boats that were sinking
vital U.K. supply ships in the North Atlantic. Alan Turing, working at the top-
secret Bletchley Park code-breaking site, introduced Bayesian methods to help
decipher the German Navy's messages encrypted using the Enigma machine.
Soon after his arrival at Bletchley Park, Turing helped automate the process
of searching through the huge number of possible Enigma settings. With