Neural Networks: An Overview - Neural Networks: Methodology and Applications

[Figure: curves labeled Class A and Class B; horizontal axis from -15 to +10]

Fig. 1.22. Probability densities for classes A and B

That decision rule is satisfactory if the misclassification costs are the same for the two classes; however, one frequently encounters applications where it may be more detrimental, or more costly, to make a false-positive misclassification (the pattern is considered to belong to class A whereas it actually belongs to class B) than a false-negative misclassification (the pattern is considered to belong to class B whereas it actually belongs to class A). In data mining applications, for instance, a company that provides information filters may find it preferable to market a filter that rejects some documents that are relevant to the chosen topic than to market a filter that lets irrelevant documents through (the user immediately spots documents that are irrelevant, whereas he may never find out that the filter missed a relevant text). In practice, such considerations are an important part of classifier design, whether in pattern recognition, data mining, credit scoring, etc. Therefore, it is generally very desirable, in practical applications, to estimate posterior probabilities and subsequently make decisions: classifiers that determine class boundaries directly may lead to serious misconceptions.
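The role of unequal costs can be sketched in a few lines of Python. This is only an illustration, assuming that an estimate of the posterior probability of class A is already available; the function name `decide` and the two cost parameters are hypothetical names introduced here, not notation from the text.

```python
def decide(posterior_a, cost_false_positive=1.0, cost_false_negative=1.0):
    """Return 'A' or 'B' by minimizing the expected misclassification cost.

    posterior_a: estimated Pr(A | x), a number in [0, 1].
    cost_false_positive: cost of deciding A when the pattern belongs to B.
    cost_false_negative: cost of deciding B when the pattern belongs to A.
    (All names are assumptions made for this sketch.)
    """
    if not 0.0 <= posterior_a <= 1.0:
        raise ValueError("posterior_a must be a probability")
    # Expected cost of each possible decision, given the posterior.
    risk_if_a = cost_false_positive * (1.0 - posterior_a)
    risk_if_b = cost_false_negative * posterior_a
    return "A" if risk_if_a < risk_if_b else "B"


if __name__ == "__main__":
    # Equal costs: the most probable class wins.
    print(decide(0.6))
    # False positives three times as costly: the same posterior now yields B.
    print(decide(0.6, cost_false_positive=3.0))
```

With equal costs the rule reduces to choosing the class with the larger posterior probability. With unequal costs the same posterior can lead to a different decision, which is only possible because the posterior was estimated before the decision was made.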

The combination of Bayes formula and of the Bayes decision rule is called the Bayes classifier, which has the best achievable performance if the prior probabilities and the likelihoods are known exactly. Since that condition is seldom fulfilled in practice, the Bayes classifier is essentially of theoretical interest. For instance, it may serve as a reference for assessing the quality of a classifier, by applying the classifier under assessment to an academic problem where prior probabilities and likelihoods are known exactly.
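For reference, the two ingredients of the Bayes classifier can be written compactly. The notation is an assumption made for this sketch and may differ from the symbols used elsewhere in the text: C_i denotes a class, Pr(C_i) its prior probability, and p(x | C_i) the likelihood of the feature value x. The decision rule is shown in its equal-cost form.

```latex
\Pr(C_i \mid x) = \frac{p(x \mid C_i)\,\Pr(C_i)}{\sum_{j} p(x \mid C_j)\,\Pr(C_j)},
\qquad
\text{assign } x \text{ to the class } C_i \text{ with the largest } \Pr(C_i \mid x).
```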

As an illustrative example, consider a problem with two classes and one feature; the patterns of class A are generated from a mixture of two Gaussians; the patterns of class B are generated from a uniform distribution in a bounded interval (Fig. 1.22). Therefore, the posterior probabilities can be computed exactly (Fig. 1.23), and so can the boundaries between classes (Fig. 1.24). In

Fig. 1.23. Posterior probability of class A, from Bayes formula
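A problem of this kind can be sketched in Python as follows. The excerpt does not give the parameters of the distributions, so every numerical value below (means, standard deviations, mixture weights, interval bounds, equal priors) is a hypothetical choice made only for illustration; the resulting curve is not claimed to reproduce Fig. 1.23.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def likelihood_a(x):
    """p(x | A): mixture of two Gaussians (hypothetical parameters)."""
    return 0.5 * gaussian_pdf(x, -8.0, 2.0) + 0.5 * gaussian_pdf(x, 3.0, 2.0)


def likelihood_b(x, low=-5.0, high=0.0):
    """p(x | B): uniform density on [low, high] (hypothetical bounds)."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def posterior_a(x, prior_a=0.5):
    """Pr(A | x) from Bayes formula, assuming the prior Pr(A) = prior_a."""
    numerator = likelihood_a(x) * prior_a
    evidence = numerator + likelihood_b(x) * (1.0 - prior_a)
    if evidence == 0.0:
        # Far in the tails both densities underflow to zero.
        raise ValueError("x has zero density under both classes")
    return numerator / evidence


if __name__ == "__main__":
    # Tabulate the posterior over the range shown on the figure axis.
    for x in range(-15, 11, 5):
        print(x, round(posterior_a(float(x)), 4))
```

Because class B has zero density outside its interval, the posterior of class A equals 1 there. Inside the interval, the posterior depends on how the mixture density compares with the constant uniform density, and with equal costs the class boundaries lie where it crosses 0.5.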
