[Figure: two panels, (a) and (b), plot the signed distance d_k of each pattern to the separating hyperplane against pattern number k (0 to 200), with the two classes y_k = +1 and y_k = -1 shown in different colors and the training set marked in each panel; the estimated generalization error is ε_g = 0.22.]
Fig. 6.14. Distance of the patterns to the separating hyperplane, with different colors for the different classes. The sign of d_k represents the class assigned by the perceptron after learning. Left: learning with the first M = 104 patterns; the last G = 104 examples belong to the test set. Right: distances to the hyperplane determined with all the patterns in the database, showing that they are linearly separable
Fig. 6.15. Histogram of the stabilities of the examples with respect to the hyperplane that separates the complete set of patterns
Since the training examples stem from noisy measurements of the corresponding physical signals, those distances make it possible to assign a degree of plausibility (or a probability density) to the perceptron output.
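As an illustration, here is a minimal sketch of that idea, assuming a trained weight vector w and bias b; the logistic mapping from distance to plausibility, and its scale, are assumptions made purely for illustration, not a mapping prescribed by the text.

    import numpy as np

    def signed_distances(X, w, b):
        """Signed distance d_k of each pattern (row of X) to the hyperplane
        w . x + b = 0; the sign is the class assigned by the perceptron."""
        return (X @ w + b) / np.linalg.norm(w)

    def plausibility(d, scale=1.0):
        """Map a signed distance to a plausibility of class +1 through a
        logistic squashing (an assumed choice; the scale would have to be
        calibrated, e.g. from the noise level of the measurements)."""
        return 1.0 / (1.0 + np.exp(-d / scale))

    # Hypothetical usage: X holds one 60-component sonar pattern per row.
    # d = signed_distances(X, w, b)   # distances as plotted in Fig. 6.14
    # p = plausibility(d)             # near 0 or 1 far from the hyperplane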
Remark 1. The fact that the 208 patterns of the sonar database turned out to be linearly separable is not surprising. A theorem due to Cover [Cover 1965], later generalized by E. Gardner (see [Gardner 1989]) to the case of correlated data [Engel et al. 2001], states that the probability that a set of points in general position (that is, such that no subset of more than N points lies on one hyperplane) is linearly separable depends only on the ratio M/N, where M is the number of points and N the space dimension. In particular, for N = 60 and M = 208, and if the patterns are correlated, as is the
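For points in general position, Cover's result has the closed form P(M, N) = 2^(1-M) * sum_{k=0}^{N-1} C(M-1, k). As a minimal sketch (Python is used here purely for illustration), the following evaluates that expression exactly:

    from math import comb

    def cover_separability_probability(M, N):
        """Cover (1965): probability that M points in general position in
        dimension N are linearly separable,
            P(M, N) = 2**(1 - M) * sum_{k=0}^{N-1} C(M - 1, k).
        Exact integer arithmetic avoids floating-point overflow; this holds
        for points in general position, not for correlated data."""
        return sum(comb(M - 1, k) for k in range(N)) / 2 ** (M - 1)

    print(cover_separability_probability(120, 60))  # exactly 0.5 at M = 2N
    print(cover_separability_probability(208, 60))  # essentially zero

At M = 2N the formula gives exactly 1/2, while for M = 208 and N = 60 the probability is vanishingly small: for points in general position, separability at this ratio would be essentially impossible, so the observed separability of the sonar patterns presumably reflects their correlations, the situation addressed by Gardner's generalization.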