Information Technology Reference
In-Depth Information
Notice that if the components x i of the input patterns are real-valued, the
probability that n + 1 points be on a same hyperplane is zero. In the case of
binary input components, that probability does not vanish. In the following
we restrict to points in general position.
As mentioned above, if m
n , all dichotomies of m points in general
position are linearly separable. The following computation is interesting for
m>n .Wedenoteby D ( m,n ) the number of dichotomies of L m generated by
the hyperplanes in dimension n . For points in general position, that number
depends on m and n only. We have
D ( m, 1) = 2; D (1 ,n ) = 2
for all m,n
since there are two ways of separating (by assigning classes ± 1) m points
in 1 dimension, and there are also two ways of assigning a class to a single
point in dimension n , with a hyperplane containing the origin. Consider now a
training set that includes the patterns of L m and a new point x m +1 , L m +1 =
L m
x m +1 . It may happen that two hyperplanes that generated the same
dichotomy in L m assign two different classes to x m +1 . In that case, there exists
a hyperplane H 0 , containing x m +1 , such that it generates the same dichotomy
on the patterns of L m . H 0 is ambiguous with respect to x m +1 . Then, H 0
generates a dichotomy of L m
in the space of dimension n
1 orthogonal to
H 0 . There is a one-to-one correspondence between the D ( m,n
1) dichotomies
in the space of dimension n
1 and the ambiguous dichotomies of the new
point in the n -dimensional space. Since there exist D ( m,n ) dichotomies of
L m , and each one induces two dichotomies of L m +1 , we obtain the following
recurrence [Cover 1965]:
D ( m +1 ,n )= D ( m,n )+ D ( m,n
1)
from which the expression of D ( M,N ) is obtained.
References
1. Anlauf J.K., Biehl, M. [1989], The AdaTron: An adaptive perceptron algorithm,
Europhys. Lett. 10, pp 687-692
2. Baum E.B., Haussler D. [1989], What size net gives valid generalization?, Neural
Computation 1, pp 151-160
3. Blake, C.L., Merz C.J. [1998], UCI Repository of machine learning databases,
available from http://www.ics.uci.edu/mlearn/MLRepository.html
4. Buhot A., Gordon M.B. [1997], Cost function and pattern distribution of the
Bayesian perceptron, Phys. Lett. A 228, pp 73-78
5. Buhot A., Torres Moreno J.M., Gordon M.B. [1997], Finite size scaling of the
Bayesian perceptron, Phys. Rev. E 55, pp 7434-7440
6. Buhot A., Torres Moreno J.M., Gordon M.B. [1997], Numerical simulations of
an optimal algorithm for supervised learning, European Symposium on Artificial
Neural Networks , Proceedings, M. Verleysen ed., pp 151-156
Search WWH ::




Custom Search