[Figure 6.14: two panels, (a) and (b), plotting the distance d_k of each pattern to the separating hyperplane against the pattern number k (0 to 200), with different symbols for the classes y_k = +1 and y_k = −1; the extent of the training set and the generalization error ε_g = 0.22 are marked.]
Fig. 6.14. Distance of the patterns to the separating hyperplane, with different colors for the different classes. The sign of d^µ represents the class assigned by the perceptron after learning. Left: learning with the M = 104 first patterns; the last G = 104 examples belong to the test set. Right: distances to the hyperplane determined with all the patterns in the database, showing that they are linearly separable.
Fig. 6.15. Histogram of the stabilities of the examples with respect to the hyperplane that separates the complete set of patterns.
the training examples stem from noisy measurements of the corresponding
physical signals, those distances allow assigning a degree of plausibility (or a
probability density) to the perceptron output.
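As a minimal sketch of this idea, the signed distance of a pattern to the hyperplane can be computed directly from the weights, and one common (here hypothetical, not necessarily the book's) choice is to squash that distance through a logistic function to obtain a degree of plausibility for the assigned class. The weights `w`, bias `b`, and pattern `x` below are illustrative, not taken from the sonar database.

```python
import numpy as np

def signed_distance(w, b, x):
    """Signed Euclidean distance of pattern x to the hyperplane w.x + b = 0.
    Its sign is the class assigned by the perceptron."""
    return (np.dot(w, x) + b) / np.linalg.norm(w)

def plausibility(d, sigma=1.0):
    """Illustrative choice: a logistic squashing of the distance,
    giving a degree of plausibility for class +1."""
    return 1.0 / (1.0 + np.exp(-d / sigma))

# Hypothetical 2-D example: hyperplane x1 + x2 - 1 = 0
w, b = np.array([1.0, 1.0]), -1.0
x = np.array([2.0, 1.0])
d = signed_distance(w, b, x)   # (2 + 1 - 1)/sqrt(2) = sqrt(2)
label = 1 if d >= 0 else -1    # class assigned by the perceptron
p = plausibility(d)            # well above 1/2: x lies clearly on the +1 side
```

Patterns far from the hyperplane thus receive plausibilities close to 0 or 1, while patterns near it (where measurement noise could flip the class) receive values near 1/2.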
Remark 1. The fact that the 208 patterns of the sonar database turned out to be linearly separable is not surprising. A theorem due to Cover [Cover 1965], later generalized by E. Gardner (see [Gardner 1989]) to the case of correlated data [Engel et al. 2001], states that the probability that a set of points in general position (that is, such that no subset of more than N points lies on one hyperplane) is linearly separable depends only on the ratio M/N, where M is the number of points and N the space dimension. In particular, for N = 60 and M = 208, and if the patterns are correlated, as is the
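Cover's probability can be computed explicitly. For M points in general position in dimension N, the standard counting argument gives a separability probability of 2^(1−M) Σ_{k=0}^{N−1} C(M−1, k) for hyperplanes through the origin (a perceptron with a bias term effectively adds one dimension). A minimal sketch, assuming uncorrelated points in general position:

```python
from math import comb

def cover_probability(M, N):
    """Cover (1965): probability that M points in general position in
    dimension N are separable by a hyperplane through the origin:
    2^(1-M) * sum_{k=0}^{N-1} C(M-1, k).  Equals 1 whenever M <= N."""
    if M <= N:
        return 1.0
    return sum(comb(M - 1, k) for k in range(N)) / 2 ** (M - 1)

# Known landmark: at M = 2N the probability is exactly 1/2.
p_half = cover_probability(4, 2)      # 0.5

# For the sonar figures M = 208, N = 60 (M/N ~ 3.5), so for *uncorrelated*
# points the separability probability would be vanishingly small -- which is
# why the correlations in the data matter for the argument in the text.
p_sonar = cover_probability(208, 60)
```

The value of `p_sonar` illustrates the point of the remark: well above M/N = 2, random points in general position are almost never separable, so the observed separability of the sonar patterns must be discussed in light of their correlations.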