in multi-class problems that are claimed to be difficult, the examples turn out
to be pairwise linearly separable. In such cases, powerful, elegant algorithms
give excellent solutions, as explained in Chap. 6: therefore, the first step in
the design of a classifier consists in checking whether the classes are pairwise
linearly separable. Ho and Kashyap's algorithm [Ho 1965], which was discov-
ered long before neural networks came into existence, provides an answer to
that problem in finite time:
• If the examples are linearly separable, the algorithm finds a separating hyperplane in finite time.
• If the examples are not linearly separable, the algorithm signals it in finite time (see algorithmic complements at the end of this chapter).
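In its usual formulation, the Ho and Kashyap procedure stacks the augmented patterns of one class and the negated augmented patterns of the other into a matrix Y, and iteratively seeks a weight vector w and a strictly positive margin vector b such that Yw = b; reaching that goal yields a separating hyperplane, whereas a non-positive, non-zero error vector proves that none exists. The following Python/NumPy sketch only illustrates that scheme; the function name and the numerical settings (eta, max_iter, tol) are choices made here, not part of the algorithm's original statement.

import numpy as np

def ho_kashyap(X_a, X_b, eta=0.5, max_iter=10000, tol=1e-6):
    """Ho-Kashyap test of linear separability between two classes.

    X_a, X_b: arrays of shape (n_a, d) and (n_b, d) holding the examples
    of the two classes. Returns (separable, w), where w is an augmented
    weight vector (d weights plus a bias) if a separating hyperplane was
    found, and None otherwise.
    """
    # Augment each pattern with a constant input of 1 and negate the
    # class-B patterns, so that separability amounts to Y @ w > 0 rowwise.
    Y = np.vstack([np.hstack([X_a, np.ones((len(X_a), 1))]),
                   -np.hstack([X_b, np.ones((len(X_b), 1))])])
    Y_pinv = np.linalg.pinv(Y)
    b = np.ones(len(Y))                # target margins, kept strictly positive
    w = Y_pinv @ b
    for _ in range(max_iter):
        e = Y @ w - b                  # gap between reached and required margins
        if np.max(np.abs(e)) < tol:
            return True, w             # Y @ w ~ b > 0: separating hyperplane found
        if np.max(e) <= 0 and np.min(e) < 0:
            return False, None         # error all non-positive: not separable
        e_plus = (e + np.abs(e)) / 2   # keep only the positive part of the error
        b = b + 2 * eta * e_plus       # raise the margins that are exceeded
        w = Y_pinv @ b
    return False, None                 # undecided within max_iter iterations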
The postal code database provided by the National Institute of Standards
and Technology has served as a basis for many classifier designs. It turns
out that, even with low-level representations such as a pixel representation,
the classes of examples are pairwise linearly separable [Knerr 1992]! Simi-
larly, a famous sonar signal database has been investigated in great detail by
many authors, and many complicated classifiers were designed to solve that
two-class problem; actually, in less than ten minutes on a PC, the Ho and
Kashyap algorithm, implemented as an uncompiled Matlab program, proves
that the examples of the two classes are linearly separable. Therefore, a sim-
ple Perceptron can solve the problem, without resorting to any hidden layer.
That application is investigated in more detail in Chap. 6.
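The pairwise separability results mentioned above can be checked mechanically by running the same two-class test on every pair of classes. The short sketch below assumes the ho_kashyap function of the previous sketch and a dictionary mapping each class label to the array of its examples; both names are illustrative.

from itertools import combinations

def check_pairwise_separability(patterns_by_class):
    """Apply the Ho-Kashyap test to every pair of classes.

    patterns_by_class: dict mapping a class label to an array of shape
    (n_examples, d) holding that class's examples. Returns the list of
    pairs that are NOT linearly separable (empty if every pair is
    separable, as for the databases cited above).
    """
    non_separable = []
    for label_a, label_b in combinations(patterns_by_class, 2):
        separable, _ = ho_kashyap(patterns_by_class[label_a],
                                  patterns_by_class[label_b])
        if not separable:
            non_separable.append((label_a, label_b))
    return non_separable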
1.3.5.3 Classifier Design Methodology
From the previous section, the following strategy for the design of a neural
classifier can be derived (as discussed above, one should first ascertain that
statistical classification is relevant to the problem at hand):
• Find an appropriate representation of the patterns to be classified, especially for pattern recognition (the techniques described in Chap. 3 can be particularly useful in that respect); this is a crucial step, since a representation that has a high discriminative power is likely to make the classification problem trivial; this is illustrated in applications described below.
• If the number of examples available for training the classifier is not larger than the dimension of the feature vector, there is no point in pursuing the design any further, according to Cover's theorem [Cover 1965], which is explained in Chap. 6: before proceeding to the next steps, either a more "compact" representation must be found, or additional examples must be collected, or a very stringent regularization method such as weight decay (described in Chap. 2) must be implemented (a rough sanity check is sketched after this list).
• For each pair of classes, select the relevant features with the feature selection methods described in Chap. 2; obviously, the discrimination of class A from class B may not require the same features as the discrimination of class A from class C (a pairwise selection loop is also sketched after this list).
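As a rough illustration of the last two steps, the sketch below first performs the sanity check on the number of examples suggested by Cover's theorem, then ranks the features separately for each pair of classes. A simple Fisher-type score is used here only as a stand-in for the feature-selection methods of Chap. 2, and the function names and data layout are assumptions of this sketch.

import numpy as np
from itertools import combinations

def enough_examples(patterns_by_class):
    """Cover's-theorem sanity check: the number of training examples
    should exceed the dimension of the feature vector."""
    n_examples = sum(len(X) for X in patterns_by_class.values())
    dimension = next(iter(patterns_by_class.values())).shape[1]
    return n_examples > dimension

def rank_features_for_pair(X_a, X_b):
    """Rank features for one pair of classes with a simple Fisher-type
    score (squared difference of the class means over the sum of the
    class variances, feature by feature)."""
    mean_a, mean_b = X_a.mean(axis=0), X_b.mean(axis=0)
    var_a, var_b = X_a.var(axis=0), X_b.var(axis=0)
    score = (mean_a - mean_b) ** 2 / (var_a + var_b + 1e-12)
    return np.argsort(score)[::-1]     # most discriminative features first

def pairwise_feature_ranking(patterns_by_class):
    """Rank the features separately for every pair of classes."""
    return {(a, b): rank_features_for_pair(patterns_by_class[a],
                                           patterns_by_class[b])
            for a, b in combinations(patterns_by_class, 2)}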