where η = 1 − αδ is the leakage factor and is chosen close to, but below, 1 (η < 1). δ can
also be a diagonal matrix, with the entries along the diagonal chosen to weight
more heavily against unconstrained growth in regions where the input is energy
deficient [23]. Note that if δ is chosen small, there is only a little performance
degradation, but if it is too small, overflow can occur. This technique is nearly
equivalent to adding a small uncorrelated noise component (of power δ) to the
input [23,192]. The noise is generated by one of several available techniques
[180] so as to be a stationary and ergodic process with zero mean
and diagonal variance matrix with equal elements. This white noise has the effect
of exciting all the ranges of the input. For examples of these techniques, see [24],
where applications of MCA to the computation of the focus of expansion (FOE)
in passive navigation are given.
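As a minimal sketch of the leaky learning and noise injection ideas above (the update form, the function names, and the rank-1 anti-Hebbian gradient term are illustrative assumptions, not the exact learning laws of this chapter):

import numpy as np

def leaky_mca_step(w, x, alpha, delta):
    """Illustrative leaky update: the leakage factor eta = 1 - alpha*delta,
    chosen close to (but below) 1, damps unconstrained weight growth when
    the input is energy deficient. The gradient term is a hypothetical
    rank-1 (anti-Hebbian) form, not the exact MCA rule."""
    eta = 1.0 - alpha * delta       # leakage factor, eta < 1
    y = w @ x                       # neuron output
    return eta * w - alpha * y * x  # leaky instantaneous gradient step

def noisy_input(x, delta, rng):
    """Nearly equivalent alternative: add zero-mean white noise of power
    delta to the input, so that all directions of the input are excited."""
    return x + rng.normal(0.0, np.sqrt(delta), size=x.shape)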
2.8.3 Preprocessing and Preconditioning
The training sets have to be preprocessed to translate the centroid of the data
to the origin of the coordinate axes (i.e., all inputs must have zero mean). This
technique is also justified by the RQ properties (2.2) and (2.3).
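A minimal sketch of this preprocessing step follows (the row-per-sample convention is an assumption):

import numpy as np

def center_training_set(X):
    """Translate the centroid of the training set to the origin of the
    coordinate axes, so that every input component has zero mean.
    X is assumed to hold one training vector per row."""
    centroid = X.mean(axis=0)
    return X - centroid, centroid  # keep the centroid to undo the shift later

# After centering, the mean of each component is (numerically) zero.
X = np.random.rand(100, 3)
Xc, mu = center_training_set(X)
assert np.allclose(Xc.mean(axis=0), 0.0)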
The exact adaptive gradient methods require explicit knowledge of the auto-
correlation matrix R = E[x(t)x^T(t)] of the input vector x(t). This is the case
of the solution of overdetermined systems where all rows of the coefficient matrix
compose the TS for the neuron. The instantaneous adaptive gradient algorithms
replace the knowledge of R with its rank 1 update x(t)x^T(t) and are controlled by
the learning rate, which also controls the memory length of the system. These
instantaneous methods cannot be used to compute the eigenpairs of a fixed deter-
ministic matrix. 14 According to [48], the gradient-based adaptive algorithms are
unattractive when not used in their approximated instantaneous form. However,
in [24] a constrained MCA EXIN algorithm, called TLS EXIN, is proposed that
works very well for finding the TLS solution of overdetermined sets of equations.
In these problems, pre- and postconditioning of the autocorrelation matrix, which
corresponds to the data of the overdetermined set, should also be considered.
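The contrast between the exact and the instantaneous gradient steps can be sketched as follows (the plain descent form on w, without the normalization or constraints used by the actual MCA laws, is an illustrative assumption):

import numpy as np

def exact_gradient_step(w, R, alpha):
    """Exact adaptive gradient step: requires explicit knowledge of the
    autocorrelation matrix R = E[x(t) x(t)^T] of the input."""
    return w - alpha * (R @ w)

def instantaneous_gradient_step(w, x, alpha):
    """Instantaneous step: R is replaced by its rank-1 update x(t) x(t)^T,
    so only the current sample is needed; the learning rate alpha also
    controls the memory length of the system."""
    return w - alpha * x * (x @ w)  # (x x^T) w without forming the outer product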
2.8.4 Acceleration Techniques
The MCA learning laws are instantaneous adaptive gradient algorithms and therefore
work sequentially. Different methods used in optimization theory and in neural
theory to improve the learning process can be applied equally here. Among
these, the most interesting are the following (a sketch of the momentum variant
is given after the list):
1. Adding a momentum term [153]
2. The bold driver technique [7,185], which checks the error function to set
the learning parameters online
3. The optimal learning-rate parameter estimation using the Hessian matrix
of the error function [49]
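As an illustration of the first of these techniques, a momentum term reuses the previous weight change to smooth the sequential updates (the coefficient name mu and the rank-1 gradient form are assumptions for this sketch):

def momentum_step(w, x, prev_dw, alpha, mu):
    """One update with a momentum term [153]: the previous weight change
    prev_dw is added back with coefficient mu in [0, 1), smoothing the
    instantaneous gradient steps. The gradient term is illustrative,
    not the exact MCA EXIN rule."""
    grad = x * (x @ w)                  # instantaneous (rank-1) gradient term
    dw = -alpha * grad + mu * prev_dw   # momentum-augmented weight change
    return w + dw, dw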
14 As a consequence, batch techniques don't work well for MCA neurons.