Remark. Of course, the factor 1/M in front of the sum does not play any role in the minimization of the cost function. It allows the definition of the average cost per example, a quantity that makes it easy to compare results obtained with training sets of different sizes.
The partial cost V(z_k) must satisfy some conditions in order that the minimum of the cost function correspond to appropriate weights. Weights w that produce negative aligned fields must have a higher cost than weights producing positive aligned fields. Thus, V(z) must be a non-increasing function of the aligned field z. However, that condition on V is not sufficient, at least in the case of a linearly separable training set: if w separates L_M correctly, then any weight vector of the form a w with a > 1 is also a solution, with a lower cost. Hence, a minimization algorithm would never converge, since the cost can decrease without bound by increasing the norm of w without modifying the hyperplane orientation. To avoid this, we impose the constraint that ‖w‖ be constant. The normalizations ‖w‖ = 1 and ‖w‖ = √(N + 1) in the extended space (or ‖w‖ = √N in input space) are the most popular ones.
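A quick numerical check illustrates why the norm must be constrained: for a separating weight vector w, scaling it by a > 1 leaves the hyperplane unchanged but lowers the cost. The sketch below assumes, purely for illustration, the non-increasing partial cost V(z) = ln(1 + exp(−z)); the data and variable names are hypothetical.

```python
import numpy as np

# Two linearly separable examples (rows of X) with desired outputs y.
X = np.array([[1.0, 2.0], [-2.0, -1.0]])
y = np.array([1.0, -1.0])
w = np.array([1.0, 1.0])  # separates the set: all aligned fields are > 0

def cost(w):
    z = y * (X @ w)                        # aligned fields z_k = y_k w.x_k
    return np.mean(np.log1p(np.exp(-z)))   # average cost per example

# Same hyperplane orientation, strictly lower cost:
assert cost(3.0 * w) < cost(w)
```

Because the cost keeps decreasing as the scale factor grows, an unconstrained minimizer would drive ‖w‖ to infinity, which is exactly what the normalization constraint prevents.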
The simplest method of minimizing C(w) is the gradient descent algorithm, described in Chap. 2, which iteratively modifies the weights according to
w(t + 1) = w(t) + Δw(t),

with

Δw(t) = −μ [∂C(w)/∂w]_{w(t)} = −(μ/M) Σ_{k=1}^{M} [∂V(z_k)/∂z_k](t) y_k x_k = Σ_{k=1}^{M} c_k(t) y_k x_k,
where μ is the learning rate, and we introduced the relation ∂z_k/∂w = y_k x_k. It is convenient to normalize the weights after each iteration. The last relation shows that the weights can be written in the general form
w = Σ_{k=1}^{M} c_k y_k x_k.
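The iterative procedure above can be sketched in code. This is a minimal illustration, not a definitive implementation: it assumes V(z) = ln(1 + exp(−z)) as the non-increasing partial cost, renormalizes to ‖w‖ = 1 after each iteration, and uses hypothetical function and variable names.

```python
import numpy as np

def train_perceptron(X, y, mu=0.1, epochs=100):
    """Gradient descent on C(w) = (1/M) sum_k V(z_k), with z_k = y_k w.x_k
    and V(z) = log(1 + exp(-z)) as an example non-increasing partial cost.
    The weights are renormalized to ||w|| = 1 after each iteration."""
    M, n = X.shape
    w = np.random.default_rng(0).normal(size=n)
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        z = y * (X @ w)                  # aligned fields z_k(t)
        dV = -1.0 / (1.0 + np.exp(z))    # dV/dz_k for V(z) = log(1 + exp(-z))
        c = -(mu / M) * dV               # coefficients c_k(t), all >= 0
        w = w + (c * y) @ X              # Delta w(t) = sum_k c_k(t) y_k x_k
        w /= np.linalg.norm(w)           # enforce the constraint ||w|| = 1
    return w
```

For a linearly separable training set, the returned w should produce positive aligned fields on every example; the renormalization step keeps the iteration from simply inflating the norm.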
The parameters c_k, which are the sums of the c_k(t) over all the iterations, depend on the algorithm. If c_k = 1 in the expression of w, the mathematical expression of Hebb's rule is retrieved. That learning rule states that the information used for modifying the synaptic efficacies in the nervous system is the correlation between the activity of the pre-synaptic neuron (neuron excitation) and that of the post-synaptic neuron (neuron firing rate). It is worth
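For concreteness, Hebb's rule with all c_k = 1 reduces to a single correlation sum over the training set. A minimal sketch, with hypothetical names and data:

```python
import numpy as np

def hebb_rule(X, y):
    # Hebb's rule: w = sum_k y_k x_k, i.e. all coefficients c_k equal 1.
    # The weights are the correlation between inputs and desired outputs.
    return y @ X

# Two training examples (rows of X) with desired outputs +1 and -1.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0])
w = hebb_rule(X, y)
```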