where α is the momentum constant, with a typical value 0.5 < α < 0.9. The added term represents the memorized value of the last weight increment, so that the next weight change keeps approximately the same direction as the previous one. This stabilizes the learning convergence.
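As an illustration, the following minimal sketch shows a gradient descent weight update with the momentum term described above; the function name, the toy error surface, and the parameter values are assumptions made for this example and are not taken from the source.

```python
import numpy as np

def momentum_update(w, grad, prev_delta, eta=0.1, alpha=0.7):
    """One gradient-descent step with a momentum term (0.5 < alpha < 0.9).

    w          -- current weight vector
    grad       -- gradient of the error with respect to w
    prev_delta -- weight increment applied in the previous step
    eta        -- learning rate (illustrative value)
    alpha      -- momentum constant (illustrative value)
    """
    # New increment: plain gradient step plus the memorized last increment
    delta = -eta * grad + alpha * prev_delta
    return w + delta, delta

# Hypothetical usage on the quadratic error surface E(w) = 0.5 * ||w||^2
w = np.array([1.0, -2.0])
delta = np.zeros_like(w)
for _ in range(50):
    grad = w                      # gradient of 0.5 * ||w||^2 is w
    w, delta = momentum_update(w, grad, delta)
```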
An alternative way of speeding up and stabilizing the convergence is the implementation of an adaptive step size. Silva and Almeida (1990) recommend the following weight update strategy
\[
w_{ij}(t) = w_{ij}(t-1) - \eta_{ij}(t)\, C_{ij}(t),
\]
where \( C_{ij}(t) \) are the gradient components of the individual iteration steps
\[
C_{ij}(t) = \sum_{q=1}^{N} \frac{\partial J_q(t)}{\partial w_{ij}},
\]
with N as the number of training set samples. In the above updating relation, \( \eta_{ij}(t) \) is taken as
\[
\eta_{ij}(t) =
\begin{cases}
c\,\eta_{ij}(t-1), & \text{if } C_{ij}(t)\,C_{ij}(t-1) > 0 \\[4pt]
\dfrac{\eta_{ij}(t-1)}{c}, & \text{if } C_{ij}(t)\,C_{ij}(t-1) < 0
\end{cases}
\]
where c is a positive constant.
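A minimal sketch of this per-weight adaptive step size rule is given below; the batch gradient routine, the illustrative value of c, and the toy error surface are assumptions of the example, not part of the source.

```python
import numpy as np

def adaptive_step_update(w, grad, prev_grad, eta, c=1.2):
    """Per-weight adaptive step size in the spirit of Silva and Almeida (1990).

    The learning rate of each weight grows by the factor c while the
    corresponding gradient component keeps its sign and shrinks by 1/c
    when the sign flips. All array arguments have the same shape; c > 1.
    """
    prod = grad * prev_grad
    eta = np.where(prod > 0, eta * c,
          np.where(prod < 0, eta / c, eta))   # unchanged if the product is zero
    w = w - eta * grad                        # gradient step with per-weight rates
    return w, eta

# Hypothetical usage on the quadratic error surface E(w) = 0.5 * ||w||^2
w = np.array([1.0, -2.0])
eta = np.full_like(w, 0.05)
prev_grad = np.zeros_like(w)
for _ in range(50):
    grad = w
    w, eta = adaptive_step_update(w, grad, prev_grad, eta)
    prev_grad = grad
```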
To cope with the numerous flat and steep regions of the error surface, Yu et al. (1995) advocated embedding a dynamic learning rate into the backpropagation algorithm, based on information delivered by the first and second derivatives of the objective function with respect to the learning rate. The key point of the proposed strategy is that it avoids calculating the second derivatives in weight space, using the information collected from the training instead. To bypass the calculation of the pseudo-inverse of the Hessian matrix that is inherent in second-order optimization methods, the conjugate gradient method is used.
The overwhelming majority of improved learning algorithms focus mainly on increasing the learning speed and improving the search stability by adding a term containing the derivatives in weight space. However, improvements in both objectives, namely learning speed and convergence stabilization, are also achievable by manipulating the parameters of the neuron transfer function. Such an updating proposal was made for supervised pattern learning that adaptively manipulates the learning rate by updating the neuron's internal nonlinearity (Zhou et al., 1991). Using simulated data sets, it was shown that the proposed updating law increases the learning speed and is well suited for identification of nonlinear dynamic systems.
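As a purely illustrative sketch of this general idea (not the specific updating law of Zhou et al.), the code below treats the slope of a sigmoid transfer function as an additional trainable parameter of a single neuron and adapts it jointly with the weights by gradient descent; all names, targets, and values are assumptions made for the example.

```python
import numpy as np

def sigmoid(net, lam):
    """Sigmoid transfer function with an adjustable slope parameter lam."""
    return 1.0 / (1.0 + np.exp(-lam * net))

# Toy single-neuron example: learn the weights w and the slope lam jointly.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(100, 2))          # hypothetical input patterns
d = sigmoid(x @ np.array([1.5, -0.8]), 2.0)    # hypothetical target outputs

w = np.zeros(2)
lam = 1.0                                      # initial slope of the nonlinearity
eta = 0.5

for _ in range(200):
    net = x @ w
    y = sigmoid(net, lam)
    err = y - d
    dy = y * (1.0 - y)                         # sigmoid derivative w.r.t. its argument
    # Gradients of the mean squared error w.r.t. the weights and w.r.t. the slope
    grad_w = (err * lam * dy) @ x / len(x)
    grad_lam = np.mean(err * net * dy)
    w -= eta * grad_w
    lam -= eta * grad_lam                      # adapting the transfer-function parameter
```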
 