Image Processing Reference
laws are a specific example of Dempster's combination rule and, again, probabilities
here are not additive.
The works of Bayes (An Essay Towards Solving a Problem in the Doctrine of
Chances, 1763) later suggested a general approach that eliminated the distinction
between random probabilities and epistemological probabilities. Bayes reversed
Bernoulli's theorem: whereas Bernoulli estimated the number of successful outcomes
based on knowledge of the probability, Bayes attempted to calculate the probability
knowing the number of successful outcomes in a sample, and expressed it in terms of
initial, final and likelihood probabilities. He dealt only with additive probabilities.
Bayes never tried to make his work known to others. His essay was found only
after his death, along with other works (particularly on electrically charged bodies),
which were written using abbreviations that have not all been deciphered [HOL 62].
This has led some to wonder about the exact origins of Bayes' theorem (see, for
example, the works of Stigler [STI 82, STI 83], who suggests a... Bayesian solution to
this question).
This work was continued by Laplace (Théorie analytique des probabilités, 1812).
This era saw the rapid development of inverse probabilities, seen from a subjective
perspective. This theory underlined the distinction between the “initial” (or prior)
probability of a hypothesis, the “final” probability (after the experiment), and the
“likelihood” probability (the probability of the experiment knowing the hypothesis).
Laplace also introduced the principle of insufficient reason: some outcomes are
considered equally probable if there is no reason to think otherwise. This principle
was widely adopted until the middle of the 20th century.
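The prior/likelihood/posterior vocabulary described above can be illustrated with a small numerical sketch. The two hypotheses and all of the numbers below are invented for illustration; the priors are chosen equal, in the spirit of insufficient reason:

```python
from fractions import Fraction

# Hypothetical example: two hypotheses about a coin (fair vs. biased),
# with "initial" (prior) probabilities set equal by insufficient reason.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}

# "Likelihood": probability of observing Heads under each hypothesis
# (the biased coin is assumed to show Heads 3 times out of 4).
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

# "Final" (posterior) probability after observing one Heads,
# via Bayes' rule: posterior is proportional to prior x likelihood.
unnormalized = {h: prior[h] * likelihood_heads[h] for h in prior}
evidence = sum(unnormalized.values())
posterior = {h: unnormalized[h] / evidence for h in prior}

print(posterior)  # {'fair': Fraction(2, 5), 'biased': Fraction(3, 5)}
```

Note that the posterior probabilities still sum to 1: these are the additive probabilities Bayes worked with, in contrast with the non-additive measures mentioned earlier.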
A.1.3. The predominance of the frequentist approach: the “objectivists”
In the 19th and early 20th centuries, because of the rapid development of the
physical sciences, the modeling of human reasoning was neglected. A new discipline
emerged at that time: statistics. The concept of probability was often related to the
observation of physical phenomena and to their repetition in long sequences. The
theories of Bayes and Laplace were criticized for their subjective nature, accused of
lacking rigor, and the concept of prior probability was rejected because it seemed
too vague.
The works of Cournot (1843), Ellis (1843) and Venn (1866) then defined physical
probabilities in terms of frequencies. As emphasized by Good [GOO 59], these works
were faced with problems that were impossible to solve. For example, if, by flipping a
coin, we observe the following sequence of Heads (H) and Tails (T): THTHTHTHTH,
etc., we can infer that the probability of getting Tails is 1/2, but this does not mean we
can draw conclusions about the game's honesty.
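Good's objection can be made concrete: the relative frequency of Tails in the alternating sequence is exactly 1/2, yet the sequence is perfectly regular, so the frequency alone says nothing about whether the outcomes are random. A minimal sketch, using a run count as one simple indicator of regularity:

```python
sequence = "THTHTHTHTH"

# Relative frequency of Tails: the frequentist estimate of P(T).
freq_tails = sequence.count("T") / len(sequence)
print(freq_tails)  # 0.5

# Yet the sequence is completely regular: the number of "runs"
# (maximal blocks of identical symbols) is as large as it can be,
# which no honest random coin would be expected to produce.
runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
print(runs)  # 10 runs in 10 flips: perfectly alternating
```

The point is that two very different processes (a fair coin and a deterministic alternator) can produce the same limiting frequency, which is exactly the kind of difficulty the frequentist definition struggled with.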
One of the problems raised by these methods involves the length of the sequences
used for calculating the frequencies. They have to be long, but how long? The theories