[Figure 6.14: two panels, (a) and (b), plotting the distance d_k of each pattern to the separating hyperplane against the pattern number k (0 to 200), with different symbols for the classes y_k = +1 and y_k = −1; the extent of the training set and the generalization error ε_g = 0.22 are marked.]
Fig. 6.14. Distance of the patterns to the separating hyperplane, with different colors for the different classes. The sign of d^µ represents the class assigned by the perceptron after learning. Left: learning with the M = 104 first patterns; the last G = 104 examples belong to the test set. Right: distances to the hyperplane determined with all the patterns in the database, showing that they are linearly separable.
Fig. 6.15. Histogram of the stabilities of the examples with respect to the hyperplane that separates the complete set of patterns.
the training examples stem from noisy measurements of the corresponding
physical signals, those distances allow assigning a degree of plausibility (or a
probability density) to the perceptron output.
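As a minimal sketch of this idea, the signed distance of a pattern to the hyperplane can be computed directly from the weights, and one common (here hypothetical, not necessarily the book's) choice is to squash that distance through a logistic function to obtain a degree of plausibility for the assigned class. The weights `w`, bias `b`, and pattern `x` below are illustrative, not taken from the sonar database.

```python
import numpy as np

def signed_distance(w, b, x):
    """Signed Euclidean distance of pattern x to the hyperplane w.x + b = 0.
    Its sign is the class assigned by the perceptron."""
    return (np.dot(w, x) + b) / np.linalg.norm(w)

def plausibility(d, sigma=1.0):
    """Illustrative choice: a logistic squashing of the distance,
    giving a degree of plausibility for class +1."""
    return 1.0 / (1.0 + np.exp(-d / sigma))

# Hypothetical 2-D example: hyperplane x1 + x2 - 1 = 0
w, b = np.array([1.0, 1.0]), -1.0
x = np.array([2.0, 1.0])
d = signed_distance(w, b, x)   # (2 + 1 - 1)/sqrt(2) = sqrt(2)
label = 1 if d >= 0 else -1    # class assigned by the perceptron
p = plausibility(d)            # well above 1/2: x lies clearly on the +1 side
```

Patterns far from the hyperplane thus receive plausibilities close to 0 or 1, while patterns near it (where measurement noise could flip the class) receive values near 1/2.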
Remark 1. The fact that the 208 patterns of the sonar database turned out to be linearly separable is not surprising. A theorem due to Cover [Cover 1965], later generalized by E. Gardner (see [Gardner 1989]) to the case of correlated data [Engel et al. 2001], states that the probability that a set of points in general position (that is, such that no subset of more than N points lies on one hyperplane) is linearly separable depends only on the ratio M/N, where M is the number of points and N the space dimension. In particular, for N = 60 and M = 208, and if the patterns are correlated, as is the
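Cover's probability can be computed explicitly. For M points in general position in dimension N, the standard counting argument gives a separability probability of 2^(1−M) Σ_{k=0}^{N−1} C(M−1, k) for hyperplanes through the origin (a perceptron with a bias term effectively adds one dimension). A minimal sketch, assuming uncorrelated points in general position:

```python
from math import comb

def cover_probability(M, N):
    """Cover (1965): probability that M points in general position in
    dimension N are separable by a hyperplane through the origin:
    2^(1-M) * sum_{k=0}^{N-1} C(M-1, k).  Equals 1 whenever M <= N."""
    if M <= N:
        return 1.0
    return sum(comb(M - 1, k) for k in range(N)) / 2 ** (M - 1)

# Known landmark: at M = 2N the probability is exactly 1/2.
p_half = cover_probability(4, 2)      # 0.5

# For the sonar figures M = 208, N = 60 (M/N ~ 3.5), so for *uncorrelated*
# points the separability probability would be vanishingly small -- which is
# why the correlations in the data matter for the argument in the text.
p_sonar = cover_probability(208, 60)
```

The value of `p_sonar` illustrates the point of the remark: well above M/N = 2, random points in general position are almost never separable, so the observed separability of the sonar patterns must be discussed in light of their correlations.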