Fig. 1.22. Probability densities for classes A and B (horizontal axis from -15 to +10)
That decision rule is satisfactory if the misclassification costs are the same
for the two classes; however, one frequently encounters applications where it
may be more detrimental, or more costly, to make a false-positive
misclassification (the pattern is considered to belong to class A whereas it
actually belongs to class B) than a false-negative misclassification (the
pattern is considered to belong to class B whereas it actually belongs to
class A). In data mining applications, for instance, a company that provides
information filters may find it preferable to market a filter that rejects some
documents that are relevant to the chosen topic than to market a filter that
lets irrelevant documents through (the user immediately spots documents that
are irrelevant, whereas he may never find out that the filter missed a relevant
text). In practice, such considerations are an important part of classifier
design, whether in pattern recognition, data mining, credit scoring, etc.
Therefore, it is generally very desirable, in practical applications, to
estimate posterior probabilities and subsequently make decisions: classifiers
that determine class boundaries directly may lead to serious misconceptions.
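
To make the role of unequal costs concrete, here is a minimal sketch in Python. It assumes that the decision rule referred to above assigns a pattern to the class with the larger posterior probability, and that a posterior probability of class A is already available. The function names and the cost values are illustrative assumptions, not taken from the text. Deciding A risks a false positive, deciding B risks a false negative, and the sketch picks the decision with the smaller expected cost.

```python
def decision_threshold(cost_false_positive, cost_false_negative):
    """Posterior probability of class A above which deciding A is cheaper.

    Hypothetical helper: deciding A costs cost_false_positive * (1 - p),
    deciding B costs cost_false_negative * p, where p = Pr(A | x).
    The two expected costs are equal at the value returned here.
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def decide(posterior_a, cost_false_positive=1.0, cost_false_negative=1.0):
    """Return "A" or "B", whichever has the smaller expected cost."""
    # False positive: pattern assigned to A although it belongs to B.
    expected_cost_if_a = cost_false_positive * (1.0 - posterior_a)
    # False negative: pattern assigned to B although it belongs to A.
    expected_cost_if_b = cost_false_negative * posterior_a
    return "A" if expected_cost_if_a <= expected_cost_if_b else "B"


if __name__ == "__main__":
    # Equal costs: the usual rule, threshold 0.5 on the posterior.
    print(decide(0.6))
    # A false positive assumed four times as costly: threshold moves to 0.8.
    print(decide(0.6, cost_false_positive=4.0))
```

With equal costs the threshold on the posterior is 0.5; making false positives more costly raises it, so that a pattern is assigned to class A only when the evidence is stronger. A classifier that outputs only a class boundary offers no such adjustment after training.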
The combination of Bayes' formula and of the Bayes decision rule is called
the Bayes classifier, which has the best achievable performance if the prior
probabilities and the likelihoods are known exactly. Since that condition
is seldom fulfilled in practice, the Bayes classifier is essentially of
theoretical interest. For instance, it may serve as a reference for assessing
the quality of a classifier, by applying that classifier to an academic problem
where prior probabilities and likelihoods are known exactly.
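
For reference, the two ingredients of the Bayes classifier can be written compactly as follows. The notation is an assumption made here for illustration: x is the feature, C_i denotes class i, Pr(C_i) its prior probability, and p(x | C_i) its likelihood. The symbols used elsewhere in the source may differ.

```latex
% Bayes' formula: posterior probability of class C_i given the feature x
\Pr(C_i \mid x) = \frac{p(x \mid C_i)\,\Pr(C_i)}{\sum_{j} p(x \mid C_j)\,\Pr(C_j)}

% Bayes decision rule (equal misclassification costs): pick the most probable class
\hat{C}(x) = \arg\max_{i} \Pr(C_i \mid x)
```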
As an illustrative example, consider a problem with two classes and one
feature; the patterns of class A are generated from a mixture of two Gaussians;
the patterns of class B are generated from a uniform distribution in a bounded
interval (Fig. 1.22). Therefore, the posterior probabilities can be computed
exactly (Fig. 1.23), and so can the boundaries between classes (Fig. 1.24). In
Fig. 1.23. Posterior probability of class A , from Bayes formula
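
The kind of computation behind such a figure can be sketched in Python. All numerical values below (mixture weights, means, standard deviations, the interval of the uniform distribution, and equal priors) are hypothetical choices made for illustration; the source does not state the parameters of its example.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def likelihood_a(x):
    """Class A: mixture of two Gaussians (hypothetical parameters)."""
    return 0.5 * gaussian_pdf(x, -6.0, 2.0) + 0.5 * gaussian_pdf(x, 2.0, 1.0)


def likelihood_b(x, low=-1.0, high=6.0):
    """Class B: uniform density on [low, high] (hypothetical interval)."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def posterior_a(x, prior_a=0.5):
    """Posterior probability of class A at x, from Bayes' formula."""
    joint_a = likelihood_a(x) * prior_a
    joint_b = likelihood_b(x) * (1.0 - prior_a)
    evidence = joint_a + joint_b
    if evidence == 0.0:
        # Both densities vanish numerically far from the data;
        # fall back to the prior rather than dividing by zero.
        return prior_a
    return joint_a / evidence


def bayes_decision(x, prior_a=0.5):
    """Assign x to the class with the larger posterior (equal costs)."""
    return "A" if posterior_a(x, prior_a) >= 0.5 else "B"


if __name__ == "__main__":
    # Scan the feature axis and report where the decision changes,
    # which approximates the class boundaries on a grid of step 0.01.
    previous = None
    for i in range(-1500, 1001):
        x = i / 100.0
        label = bayes_decision(x)
        if previous is not None and label != previous:
            print(f"boundary near x = {x:.2f}: {previous} -> {label}")
        previous = label
```

Because both likelihoods and the priors are specified in closed form, the posterior is exact up to floating-point error, which is what makes such an academic problem usable as a reference for other classifiers.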