Discrimination - Neural Networks: Methodology and Applications

Information Technology Reference

In-Depth Information

w ( t +1)= w ( t )+ y k ( t ) x k ( t )

= w ( t

1) + y k ( t ) x k ( t ) + y k ( t− 1) x k ( t− 1)

−

= ...

= t

i =1

y k ( i ) x k ( i ) ,

where we assumed that the weights were initialized with the tabula rasa op-

tion: w (0) = 0. Taking the scalar product of w ( t + 1) with the unitary vector

w ∗ , and using the above bound for

w ( t +1)

, we deduce the following lower

bound,

γ k ( i ) ( w ∗ )

w ( t +1)

≥

i =1

≥

tγ min ( w ∗ ) ,

where γ min ( w ∗ ) is the stability of the pattern with smallest stability. Since

w ∗ is a separating hyperplane, γ min ( w ∗ ) > 0.

An upper bound to

2 , can be obtained as follows:

w ( t +1)

2 =( w ( t )+ y k ( t ) x k ( t ) )

( w ( t )+ y k ( t ) x k ( t ) )

w ( t +1)

2 +2 y k ( t ) x k ( t )

y k ( t ) x k ( t )

2 .

w ( t )

w ( t )+

The cross-product is negative. As before, we have explicitly

2 +

x k ( t )

w ( t +1)

≤

w ( t )

≤

...

i =1

y k ( i ) x k ( i )

≤

2 ,

≤

x max

y k |

where we used the fact that

pertains to the example in L M

of maximal norm. Figure 6.7 illustrates the growth of the norm of w upon

learning. From the above lower and upper bounds, we obtain

=1.

x max

√ t

tγ min ( w ∗ )

≤

w ( t +1)

≤

x max

Those bounds are shown on Fig. 6.8.

6.8.2 Number of Linearly Separable Dichotomies

In this section we summarize the proof of [Cover 1965]. Consider a set L m of

m points in a space of dimension n . If no subset of n + 1 points among the m

points is linearly dependent, the m points are said to lie in general position.

Neural Networks: Methodology and Applications

Search WWH ::

Custom Search

Home