It should be noted that a momentum term is quite usual with the classical backpropagation algorithm, whereas it may appear unusual with the Levenberg-Marquardt algorithm. Its use here is nevertheless justified: in the backpropagation algorithm the momentum term serves primarily to escape possible traps at local minima and to suppress small oscillations during network training, and, as verified experimentally through simulation, a small momentum term likewise improves training convergence with the Levenberg-Marquardt algorithm. Furthermore, as with the backpropagation algorithm, the Levenberg-Marquardt algorithm was extended here by adding a modified error index term, as proposed by Xiaosong et al. (1995), to improve the training convergence further. Therefore, as per (6.20c), the corresponding new gradient can be expressed in terms of the Jacobian matrix as
\[
\nabla V_{\text{new}}(\mathbf{w}) \;=\; \mathbf{J}^{T}(\mathbf{w})\,\bigl[\mathbf{e}(\mathbf{w}) + \gamma\,\mathbf{e}_{\text{avg}}(\mathbf{w})\bigr],
\tag{6.25}
\]
where \(\mathbf{e}(\mathbf{w})\) represents the column vector of errors and the constant factor \(\gamma\) (for the Levenberg-Marquardt algorithm) has to be chosen appropriately. Equation (6.25) shows that, even with the modified error index extension of the original performance function, the Jacobian matrix remains unaltered; with the above modification we need only add a new error vector term \(\gamma\,\mathbf{e}_{\text{avg}}(\mathbf{w})\) to the original error vector \(\mathbf{e}(\mathbf{w})\), as we did with the backpropagation algorithm.
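The resulting update step can be sketched as follows. This is a minimal illustration, assuming NumPy; the function name, the parameter names (gamma, mu, alpha), and the way \(\mathbf{e}_{\text{avg}}(\mathbf{w})\) is supplied are our own assumptions for the sketch, not the book's code.

```python
import numpy as np

def lm_step_with_momentum(J, e, e_avg, w, dw_prev,
                          gamma=0.1, mu=0.01, alpha=0.05):
    """One Levenberg-Marquardt update with a small momentum term and
    the modified-error-index gradient of Eq. (6.25). Hypothetical sketch.

    J       : Jacobian of the network errors, shape (S, n)
    e       : column vector of errors e(w), shape (S,)
    e_avg   : modified-error-index vector e_avg(w), shape (S,)
    w       : current weight vector, shape (n,)
    dw_prev : previous weight change, used by the momentum term
    gamma   : weighting of the modified error index, Eq. (6.25)
    mu      : Levenberg-Marquardt damping factor
    alpha   : small momentum coefficient
    """
    e_new = e + gamma * e_avg          # augmented error vector, Eq. (6.25)
    g = J.T @ e_new                    # new gradient J^T(w) e_new(w)
    H = J.T @ J                        # Gauss-Newton Hessian approximation
    n = H.shape[0]
    # Damped Gauss-Newton step plus a small momentum contribution
    dw = -np.linalg.solve(H + mu * np.eye(n), g) + alpha * dw_prev
    return w + dw, dw
```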
6.4.2.3.1 Computation of Jacobian Matrix
We now describe a simplified technique to compute the Jacobian matrix, layer by layer, together with the related parameters, from the backpropagation results. Layer-wise or parameter-wise computation of the Jacobian matrix is permissible because, as shown in Equations (6.26a) and (6.26b), the final contents of the Hessian matrix remain unaltered even if the whole Jacobian is divided into smaller parts (a short sketch of this accumulation follows Equation (6.26b)). Furthermore, this partitioning of the Jacobian matrix helps to avoid the computer memory shortage problems that are likely to occur for large neural networks.
From
\[
\nabla^{2} V(\mathbf{w}) \;\approx\; \mathbf{J}^{T}(\mathbf{w})\,\mathbf{J}(\mathbf{w})
\;=\;
\begin{bmatrix} \mathbf{J}_{1}^{T}(\mathbf{w}) & \mathbf{J}_{2}^{T}(\mathbf{w}) \end{bmatrix}
\begin{bmatrix} \mathbf{J}_{1}(\mathbf{w}) \\ \mathbf{J}_{2}(\mathbf{w}) \end{bmatrix},
\tag{6.26a}
\]
it follows that
\[
\nabla^{2} V(\mathbf{w}) \;\approx\; \mathbf{J}_{1}^{T}(\mathbf{w})\,\mathbf{J}_{1}(\mathbf{w}) + \mathbf{J}_{2}^{T}(\mathbf{w})\,\mathbf{J}_{2}(\mathbf{w}).
\tag{6.26b}
\]
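The memory saving implied by (6.26a) and (6.26b) can be seen in a short sketch: the full Jacobian is never stored, only one block \(\mathbf{J}_{i}\) at a time, while \(\mathbf{J}^{T}\mathbf{J}\) (and, for the gradient, \(\mathbf{J}^{T}\mathbf{e}\)) is accumulated. The helper name jacobian_blocks is hypothetical, standing for whatever layer-wise or pattern-wise block generator the implementation provides.

```python
import numpy as np

def accumulate_hessian(jacobian_blocks, n_params):
    """Accumulate the Hessian approximation block by block, per
    Eqs. (6.26a)-(6.26b). `jacobian_blocks` is a hypothetical iterable
    yielding (J_i, e_i) pairs, one block of Jacobian rows and the
    matching slice of the error vector at a time."""
    H = np.zeros((n_params, n_params))   # running sum of J_i^T J_i
    g = np.zeros(n_params)               # running sum of J_i^T e_i
    for J_i, e_i in jacobian_blocks:
        H += J_i.T @ J_i                 # Eq. (6.26b): sum of block products
        g += J_i.T @ e_i
    return H, g                          # identical to the full J^T J and J^T e
```

Because the Jacobian is partitioned by rows, each block product \(\mathbf{J}_{i}^{T}\mathbf{J}_{i}\) already has the final \(n \times n\) shape, so the sum reproduces \(\mathbf{J}^{T}\mathbf{J}\) exactly.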
Computation of the Jacobian matrix is in fact the most crucial step in implementing
the Levenberg-Marquardt algorithm for neuro-fuzzy networks. For this purpose,