classifier). Inducers that can construct probabilistic classifiers are known
as probabilistic inducers. In decision trees, it is possible to estimate
the conditional probability
$\hat{P}_{DT(S)}(y = c_j \mid a_i = x_{q,i};\ i = 1, \ldots, n)$ of
an observation $x_q$. The “hat” (ˆ) on the conditional probability
estimate distinguishes it from the actual conditional probability.
In classification trees, the probability is estimated for each leaf
separately by calculating the frequency of the class among the training
instances that belong to the leaf.
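As a minimal sketch of this per-leaf frequency estimate, the snippet below counts class labels among the training instances that reached one leaf. The function name `leaf_class_frequencies` and the sample labels are hypothetical and chosen only for illustration; they do not come from the text.

```python
from collections import Counter


def leaf_class_frequencies(leaf_labels, classes):
    """Estimate P(y = c | leaf) as the relative frequency of c in the leaf."""
    counts = Counter(leaf_labels)   # missing classes count as 0
    total = len(leaf_labels)        # number of training instances in the leaf
    return {c: counts[c] / total for c in classes}


# Hypothetical leaf containing four training instances.
probs = leaf_class_frequencies(
    ["yes", "yes", "yes", "no"],
    classes=["yes", "no", "maybe"],
)
print(probs)
```

In this sketch a class that never reaches the leaf ("maybe") receives an estimate of exactly 0.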
Using the frequency vector as is will typically over-estimate the
probability. This is especially problematic when a given class never
occurs in a certain leaf, because that class is then left with a zero
probability.
There are two known corrections for the simple probability estimation which
avoid this phenomenon. The following sections describe these corrections.
3.4.1 Laplace Correction
According to Laplace's law of succession [Niblett (1987)], the probability
of the event $y = c_i$, where $y$ is a random variable and $c_i$ is a
possible outcome of $y$ that has been observed $m_i$ times out of $m$
observations, is:

\[
\frac{m_i + k p_a}{m + k}, \tag{3.3}
\]

where $p_a$ is an a priori probability estimate of the event and $k$ is the
equivalent sample size that determines the weight of the a priori estimate
relative to the observed data. According to [Mitchell (1997)], $k$ is called
“equivalent sample size” because it represents an augmentation of the $m$
actual observations by $k$ additional virtual samples distributed according
to $p_a$. The above ratio can be rewritten as the weighted average of the
a priori probability and the a posteriori probability (denoted as $p_p$):
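
\[
\begin{aligned}
\frac{m_i + k \cdot p_a}{m + k}
  &= \frac{m_i}{m} \cdot \frac{m}{m + k} + p_a \cdot \frac{k}{m + k} \\
  &= p_p \cdot \frac{m}{m + k} + p_a \cdot \frac{k}{m + k} \\
  &= p_p \cdot w_1 + p_a \cdot w_2 .
\end{aligned}
\tag{3.4}
\]
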
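The equivalence between the ratio form and the weighted-average form can be checked numerically. The sketch below is illustrative only: the function names and the sample values (m_i = 0, m = 10, k = 2, p_a = 0.5) are assumptions, not taken from the text. The weighted-average form needs m > 0, since p_p = m_i / m.

```python
import math


def laplace_estimate(m_i, m, k, p_a):
    """Ratio form of Eq. (3.3): (m_i + k * p_a) / (m + k)."""
    return (m_i + k * p_a) / (m + k)


def weighted_average_form(m_i, m, k, p_a):
    """Weighted-average form of Eq. (3.4): p_p * w_1 + p_a * w_2 (requires m > 0)."""
    p_p = m_i / m        # a posteriori (observed) frequency
    w_1 = m / (m + k)    # weight of the observed data
    w_2 = k / (m + k)    # weight of the a priori estimate
    return p_p * w_1 + p_a * w_2


# Hypothetical case: the outcome was never observed in 10 trials.
direct = laplace_estimate(m_i=0, m=10, k=2, p_a=0.5)
blended = weighted_average_form(m_i=0, m=10, k=2, p_a=0.5)
print(direct, blended, math.isclose(direct, blended))
```

With these values both forms give 1/12, so an outcome that was never observed still gets a small non-zero probability, and raising k pulls the estimate further toward p_a.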
In the case discussed here, the following correction is used:

\[
\hat{P}_{Laplace}(a_i = x_{q,i} \mid y = c_j)
  = \frac{\left| \sigma_{y = c_j \text{ AND } a_i = x_{q,i}} S \right| + k \cdot p}
         {\left| \sigma_{y = c_j} S \right| + k} .
\tag{3.5}
\]

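A minimal sketch of Eq. (3.5), assuming the training set S is held as a list of dictionaries and reading σ as a selection (filter) over S. The function `laplace_conditional`, the attribute names, and the toy records are hypothetical and serve only to illustrate the formula.

```python
def laplace_conditional(records, attr, value, target, cls, k, p):
    """Laplace-corrected estimate of P(attr = value | target = cls), as in Eq. (3.5)."""
    # sigma_{y = c_j} S: instances belonging to the class
    with_class = [r for r in records if r[target] == cls]
    # sigma_{y = c_j AND a_i = x_{q,i}} S: those that also match the attribute value
    with_both = [r for r in with_class if r[attr] == value]
    return (len(with_both) + k * p) / (len(with_class) + k)


# Hypothetical training set S with one attribute and a class label.
S = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rain", "play": "yes"},
    {"outlook": "overcast", "play": "yes"},
]

# "sunny" never occurs together with play = "yes", so the raw frequency is 0.
# Assumed prior: p = 1/3 (three possible outlook values) and k = 3.
estimate = laplace_conditional(S, "outlook", "sunny", "play", "yes", k=3, p=1 / 3)
print(estimate)  # about 0.2, i.e. (0 + 1) / (2 + 3)
```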