Information Technology Reference
In-Depth Information
Notice that if the components
x
i
of the input patterns are real-valued, the
probability that
n
+ 1 points be on a same hyperplane is zero. In the case of
binary input components, that probability does not vanish. In the following
we restrict to points in general position.
As mentioned above, if
m
n
, all dichotomies of
m
points in general
position are linearly separable. The following computation is interesting for
m>n
.Wedenoteby
D
(
m,n
) the number of dichotomies of
L
m
generated by
the hyperplanes in dimension
n
. For points in general position, that number
depends on
m
and
n
only. We have
≤
D
(
m,
1) = 2;
D
(1
,n
) = 2
for all
m,n
since there are two ways of separating (by assigning classes
±
1)
m
points
in 1 dimension, and there are also two ways of assigning a class to a single
point in dimension
n
, with a hyperplane containing the origin. Consider now a
training set that includes the patterns of
L
m
and a new point
x
m
+1
,
L
m
+1
=
L
m
∪
x
m
+1
. It may happen that two hyperplanes that generated the same
dichotomy in
L
m
assign two different classes to
x
m
+1
. In that case, there exists
a hyperplane
H
0
, containing
x
m
+1
, such that it generates the same dichotomy
on the patterns of
L
m
.
H
0
is ambiguous with respect to
x
m
+1
. Then,
H
0
generates a dichotomy of
L
m
in the space of dimension
n
−
1 orthogonal to
H
0
. There is a one-to-one correspondence between the
D
(
m,n
−
1) dichotomies
in the space of dimension
n
1 and the ambiguous dichotomies of the new
point in the
n
-dimensional space. Since there exist
D
(
m,n
) dichotomies of
L
m
, and each one induces two dichotomies of
L
m
+1
, we obtain the following
recurrence [Cover 1965]:
−
D
(
m
+1
,n
)=
D
(
m,n
)+
D
(
m,n
−
1)
from which the expression of
D
(
M,N
) is obtained.
References
1. Anlauf J.K., Biehl, M. [1989], The AdaTron: An adaptive perceptron algorithm,
Europhys. Lett.
10, pp 687-692
2. Baum E.B., Haussler D. [1989], What size net gives valid generalization?,
Neural
Computation
1, pp 151-160
3. Blake, C.L., Merz C.J. [1998], UCI Repository of machine learning databases,
available from http://www.ics.uci.edu/mlearn/MLRepository.html
4. Buhot A., Gordon M.B. [1997], Cost function and pattern distribution of the
Bayesian perceptron,
Phys. Lett. A
228, pp 73-78
5. Buhot A., Torres Moreno J.M., Gordon M.B. [1997], Finite size scaling of the
Bayesian perceptron,
Phys. Rev. E
55, pp 7434-7440
6. Buhot A., Torres Moreno J.M., Gordon M.B. [1997], Numerical simulations of
an optimal algorithm for supervised learning,
European Symposium on Artificial
Neural Networks
, Proceedings, M. Verleysen ed., pp 151-156
Search WWH ::
Custom Search