Fig. 1.21. A geometrical interpretation of Bayes decision rule; the gray area is
the probability of misclassification when Bayes rule is used; the striped area is the
increase of misclassification probability resulting from a different boundary choice.
the likelihood should be denoted as $p_X(x \mid C_i, O)$, and the posterior
probabilities should be denoted as $\Pr(C_i \mid x, O)$, since their estimates
depend on the observation set $O$. For simplicity, we will not use these
notations, but it should be remembered that the estimates of probabilities and
of probability distributions are always conditioned on the observations from
which they are estimated.
1.3.4 Bayes Decision Rule
When assigning a pattern to a class, the risk of making a classification error
is minimum if the pattern is assigned to the class whose posterior probability
is highest.
Consider a classification problem with two classes $C_1$ and $C_2$, and one
feature. Clearly, the probability of misclassification is larger if the pattern
lies close to the class boundary. However, during normal operation, the
classifier will handle patterns that are described by a large range of values
of the feature, so that what one would really like to do is to find the
boundary that minimizes the global error probability, i.e., the boundary that
minimizes the quantity
$$\Pr(M) = \int_{-\infty}^{+\infty} \Pr(M \mid x)\, p_X(x)\, \mathrm{d}x,$$
where $M$ denotes the event "misclassification". Since the probability density
$p_X(x)$ is positive, the integral is minimal if $\Pr(M \mid x)$ is minimal for
all $x$. $\Pr(M \mid x)$ is the posterior probability of $C_1$ if the decision
is to assign the pattern to $C_2$, and the posterior probability of $C_2$ if the
decision is to assign the pattern to $C_1$. Therefore, $\Pr(M \mid x)$ is
minimized if the decision is to assign the pattern to the class with the higher
posterior probability.
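The pointwise argument can be written out explicitly. The block below is a sketch in the notation of this section; the prior $\Pr(C_i)$ and the label "Bayes" on the error probability are notational assumptions introduced here for illustration.

```latex
\begin{align*}
\Pr(M \mid x) &=
  \begin{cases}
    \Pr(C_2 \mid x) & \text{if the pattern is assigned to } C_1,\\
    \Pr(C_1 \mid x) & \text{if the pattern is assigned to } C_2,
  \end{cases}
  \qquad \Pr(C_1 \mid x) + \Pr(C_2 \mid x) = 1, \\[4pt]
\Pr(M \mid x)\big|_{\text{Bayes}} &= \min\{\Pr(C_1 \mid x),\ \Pr(C_2 \mid x)\}, \\[4pt]
\Pr(M)\big|_{\text{Bayes}}
  &= \int_{-\infty}^{+\infty} \min\{\Pr(C_1 \mid x),\ \Pr(C_2 \mid x)\}\, p_X(x)\, \mathrm{d}x \\
  &= \int_{-\infty}^{+\infty} \min\{\Pr(C_1)\, p_X(x \mid C_1),\ \Pr(C_2)\, p_X(x \mid C_2)\}\, \mathrm{d}x .
\end{align*}
```

The last line uses Bayes formula, $\Pr(C_i \mid x)\, p_X(x) = \Pr(C_i)\, p_X(x \mid C_i)$. Any other decision replaces the minimum by the larger of the two terms for some values of $x$, so the integral cannot decrease.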
A geometrical interpretation of that argument is shown in Fig. 1.21: if
Bayes rule is used, the misclassification probability is represented by the gray
area. Any other boundary choice would increase that area.
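The geometrical argument can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes Gaussian class-conditional densities with a shared standard deviation, and the names `error_probability` and `bayes_boundary` as well as all numerical values are hypothetical choices made for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def error_probability(boundary, mu1, mu2, sigma, prior1):
    """Pr(M) when x < boundary is assigned to C1 and x >= boundary to C2.

    Assumes Gaussian class-conditional densities with a shared sigma
    and mu1 < mu2 (illustrative assumptions, not from the text).
    """
    prior2 = 1.0 - prior1
    # Patterns of C1 that fall on the C2 side of the boundary.
    miss_c1 = prior1 * (1.0 - normal_cdf((boundary - mu1) / sigma))
    # Patterns of C2 that fall on the C1 side of the boundary.
    miss_c2 = prior2 * normal_cdf((boundary - mu2) / sigma)
    return miss_c1 + miss_c2

def bayes_boundary(mu1, mu2, sigma, prior1):
    """Point where prior1 * p(x|C1) equals prior2 * p(x|C2)."""
    prior2 = 1.0 - prior1
    return 0.5 * (mu1 + mu2) + sigma**2 * math.log(prior1 / prior2) / (mu2 - mu1)

# Hypothetical problem parameters.
mu1, mu2, sigma, prior1 = 0.0, 2.0, 1.0, 0.6
b_star = bayes_boundary(mu1, mu2, sigma, prior1)
e_star = error_probability(b_star, mu1, mu2, sigma, prior1)

# Shift the boundary away from the Bayes boundary and report the excess error
# (the "striped area" of the figure).
for shift in (-1.0, -0.5, 0.0, 0.5, 1.0):
    b = b_star + shift
    e = error_probability(b, mu1, mu2, sigma, prior1)
    print(f"boundary={b:+.3f}  Pr(M)={e:.4f}  excess={e - e_star:.4f}")
```

Under these assumptions the excess is zero only for a zero shift and positive otherwise, which corresponds to the striped area added to the gray area in the figure.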
The result can be easily extended to the multi-class case and the
multi-feature case.
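As an illustration of that extension, the sketch below applies the same rule to several classes and several features: compute the posterior probability of each class and assign the pattern to the class for which it is highest. The Gaussian densities with diagonal covariance, the function names, and the three-class example are assumptions made here for illustration, not part of the text.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at feature vector x."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - 0.5 * (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def posteriors(x, classes):
    """Return {label: Pr(C_i | x)} from priors and class-conditional densities."""
    log_joint = {
        label: math.log(c["prior"]) + log_gaussian_diag(x, c["mean"], c["var"])
        for label, c in classes.items()
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_joint.values())
    unnorm = {label: math.exp(v - top) for label, v in log_joint.items()}
    total = sum(unnorm.values())
    return {label: u / total for label, u in unnorm.items()}

def bayes_decision(x, classes):
    """Assign x to the class with the highest posterior probability."""
    post = posteriors(x, classes)
    return max(post, key=post.get), post

# Hypothetical three-class, two-feature problem.
classes = {
    "C1": {"prior": 0.5, "mean": (0.0, 0.0), "var": (1.0, 1.0)},
    "C2": {"prior": 0.3, "mean": (3.0, 0.0), "var": (1.0, 1.0)},
    "C3": {"prior": 0.2, "mean": (0.0, 3.0), "var": (1.0, 1.0)},
}
label, post = bayes_decision((2.5, 0.2), classes)
print(label, {k: round(v, 3) for k, v in post.items()})
```

Working with log densities avoids numerical underflow when the number of features is large; the decision itself depends only on which class has the largest value of prior times likelihood.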