line is present, all three sources are likely to be
activated. The hypothesis represented by this detector is
that a vertical line is actually present in the world. We
will represent this hypothesis with the variable h, which
is 1 if the hypothesis is true, and 0 if not. The null
hypothesis h̄ is that a vertical line is not present in the
world, and is just the complement (opposite) of h, so
that if h = 1, then h̄ = 0 and vice versa. Another way
of putting this is that h and h̄ are mutually exclusive
alternatives: their summed probability is always 1.
To compute objective, frequency-based probabili-
ties (as opposed to subjective probabilities) for our ex-
ample, we need a table of states of the world and their
frequencies of occurring. Figure 2.21 shows the table of
states that will define our little world for the purposes
of this example. Each state consists of values for all
of the variables in our world: the two hypotheses and
the three data inputs, together with a frequency asso-
ciated with each state that determines how many times
this state actually occurs in the world. These frequen-
cies can be seen as a simple shorthand for the objective
probabilities of the corresponding events. For example,
the events that have a frequency of 3 have an objective
probability of 3/24 or .125, where we simply divide by
the total frequency over all events (24) to convert from
frequency to probability.
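The conversion just described can be sketched in a few lines of Python. This is only an illustration: the function name frequency_to_probability is ours, not the text's, and the only numbers used are the total of 24 and the frequency of 3 quoted above.

```python
from fractions import Fraction

# Total frequency over all states of the example world, as stated in the text.
TOTAL_FREQUENCY = 24


def frequency_to_probability(frequency, total=TOTAL_FREQUENCY):
    """Convert a state's frequency into an objective probability.

    The name and signature are illustrative assumptions, not from the text.
    """
    return Fraction(frequency, total)


p = frequency_to_probability(3)
print(p, float(p))  # 1/8 0.125
```

Exact fractions are used here so that 3/24 stays recognizable as a ratio of counts; converting to a float gives the .125 quoted in the text.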
There are three basic probabilities that we are inter-
ested in that can be computed directly from the world
state table. The first is the overall probability that the
hypothesis h is true, which is written P(h = 1) or just
P(h) for short. This can be computed in the same way
as we compute the probability for a single event in the
table: just add up all the frequencies associated with
states that have h = 1 in them, and divide the result by
the total frequency (24). This computation is illustrated
in figure 2.22a, and gives a result of 12/24 or .5. The
next is the probability of the current input data (what
we are receiving from our inputs at the present time).
To compute this, we need to first pick a particular data
state to analyze. Let's choose d = 110, which is the
case where the first two inputs are active. Figure 2.22b
shows that the probability of this data state (P(d = 110)
or P(d) for short) is 3/24 (.125) because this condition
occurs 1 time when the hypothesis is false, and 2 times
when it is true in our little world.
The third probability that we need to know is just the
intersection of the first two. This is also called the joint
probability of the hypothesis and the data, and is written
P(h = 1, d = 110) or P(h, d). Figure 2.22c shows
that this is 2/24 (.083).
Our detector is primarily interested in how predictive
the data is of the hypothesis being true: it gets some
inputs and it wants to know if something is really “out
there” or not. The joint probability of the hypothesis
and the data clearly seems like an important quantity in
this regard, as it indicates how often the hypothesis and
data occur together. However, it doesn't quite give us
the right information: if we actually got input data
of 110, we would tend to think that the hypothesis is
quite likely to be true, but P(h = 1, d = 110) is only
2/24 or .083, not a particularly large probability. The
problem is that we haven't properly scoped (limited or
restricted; think of a magnifying scope zooming in on
a subset of the visual field) the space over which we are
computing probabilities. The joint probability tells us
how often these two co-occur compared to all other
possible states, but we really just want to know how often
the hypothesis is true when we receive the particular
input data we just got. This is given by the conditional
probability of the hypothesis given the data, which is
written as P(h|d), and is defined as follows:
$$P(h \mid d) = \frac{P(h, d)}{P(d)} \tag{2.23}$$
So, in our example where we got d = 110, we want to
know
$$P(h = 1 \mid d = 110) = \frac{P(h = 1, d = 110)}{P(d = 110)} \tag{2.24}$$
which is (2/24) / (3/24), or .67 according to our table.
Thus, matching our intuitions, this tells us that having 2
out of 3 inputs active indicates that it is more likely than
not that the hypothesis of a vertical line being present is
true. The basic information about how well correlated
this input data and the hypothesis are comes from the
joint probability in the numerator, but the denominator
is critical for scoping this information to the appropriate
context (cases where the particular input data actually
occurred).
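To make the scoping step concrete, here is a minimal Python sketch of the four quantities discussed above. It is an illustration under stated assumptions, not the full world-state table of figure 2.21: only the two d = 110 rows are taken directly from the text, and every other data pattern is lumped into a single hypothetical "other" bucket whose counts (10 and 11) are chosen so that the totals match the quoted values (24 states overall, 12 of them with h = 1). The names world and prob are ours.

```python
from fractions import Fraction

# Frequencies keyed by (h, d). The d = "110" rows are quoted in the text.
# "other" is an assumed bucket that collapses all remaining data patterns,
# sized so that the total is 24 and h = 1 occurs in 12 states.
world = {
    (1, "110"): 2,
    (0, "110"): 1,
    (1, "other"): 10,
    (0, "other"): 11,
}

total = sum(world.values())


def prob(predicate):
    """Sum the frequencies of states matching predicate(h, d), over the total."""
    matching = sum(freq for (h, d), freq in world.items() if predicate(h, d))
    return Fraction(matching, total)


p_h = prob(lambda h, d: h == 1)                  # P(h = 1)
p_d = prob(lambda h, d: d == "110")              # P(d = 110)
p_hd = prob(lambda h, d: h == 1 and d == "110")  # P(h = 1, d = 110)

# Conditional probability: the joint scoped to cases where d = 110 occurred.
p_h_given_d = p_hd / p_d

print(p_h, p_d, p_hd, p_h_given_d)  # 1/2 1/8 1/12 2/3
```

Dividing by P(d) cancels the total frequency, so the result is simply 2 of the 3 states in which d = 110 occurs, which is the sense in which the denominator restricts the computation to the relevant context.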