2.6.2 Divergence
Restating LaSalle's principle of invariance for the case of a weak Lyapunov function (see the proof of Theorem 60) yields:
Proposition 66 (Flow Solution Typology) Let $\phi : M \to \mathbb{R}$ be a smooth function on a Riemannian manifold with compact sublevel sets; that is, for all $c \in \mathbb{R}$ the sublevel set $\{x \in M \mid \phi(x) \leq c\}$ is a compact subset of $M$. Then [84] every solution $x(t) \in M$ of the gradient flow $\dot{x}(t) = -\operatorname{grad} \phi(x(t))$ on $M$ exists for all $t \geq 0$. Furthermore, $x(t)$ converges to a nonempty compact and connected component of the set of critical points of $\phi$ [118] as $t \to \infty$.
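As a concrete instance (an illustration, not taken from the source), let $M = \mathbb{R}^n$ and $\phi(x) = \|x\|^2$: the sublevel sets are closed balls, hence compact, and the gradient flow $\dot{x}(t) = -2x(t)$ has the explicit solutions $x(t) = e^{-2t}x(0)$, which exist for all $t \geq 0$ and converge to the unique critical point $x = 0$ as $t \to \infty$.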
Corollary 67 (Simple Convergence Behavior) The solutions of a gradient flow
have a particularly simple convergence behavior: There are no periodic solutions,
strange attractors, or chaotic behaviors [ 84 ] .
These reasonings can be applied to $E = r(w, R)$ (i.e., to the MCA learning laws, except FENG). Recalling the degeneracy property of the Rayleigh quotient (see Proposition 44 for the point of view of the Lyapunov direct method), the components of the set of critical points of $E$ are straight lines.
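This degeneracy is simply the scale invariance of the Rayleigh quotient: with the usual definition $r(w, R) = w^T R w / (w^T w)$, for any scalar $c \neq 0$,
$$r(cw, R) = \frac{c^2\, w^T R w}{c^2\, w^T w} = r(w, R),$$
so if $w^*$ is a critical point, the entire line $\{cw^* : c \neq 0\}$ consists of critical points (the origin excluded, where $r$ is undefined).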
In Section 2.6.1 we have shown that for MCA EXIN, OJAn, and LUO, the weight increment at each iteration is orthogonal to the weight direction [see eq. (2.98)] and the weight modulus always increases [see eq. (2.101)]; that is, there is divergence (the short derivation below makes the mechanism explicit).
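Indeed, orthogonality alone forces the modulus to grow: writing the update as $w(t+1) = w(t) + \Delta w(t)$ with $w^T(t)\,\Delta w(t) = 0$,
$$\|w(t+1)\|^2 = \|w(t)\|^2 + 2\,w^T(t)\,\Delta w(t) + \|\Delta w(t)\|^2 = \|w(t)\|^2 + \|\Delta w(t)\|^2 \geq \|w(t)\|^2,$$
with strict inequality whenever the increment is nonzero.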
For initial conditions of modulus greater than 1, OJA also diverges, even though its weight increment is not orthogonal to the weight direction. Every critical point of the ODE on the minimum straight line has a basin of attraction given by the locus of constant modulus equal to the modulus of the critical point. If a critical point of this line is taken as the initial condition, a small perturbation of the weight vector caused by the stochastic learning process suffices to move the weight into another basin of larger modulus (see Figure 2.3), and the neuron therefore converges to another critical point of larger modulus. This process repeats until the learning rate becomes null.
Considering that the increment of the squared weight modulus $w^T(t)\,w(t)$ [see eq. (2.97)] is inversely proportional to $w^T(t)\,w(t)$ only for MCA EXIN, its weight increment is smaller, and its weight therefore converges to a nearer critical point than do the weights of OJAn and LUO. It is evident that the ODE approximation can be accepted only outside the minimum straight line, because of its degeneracy: the theorems above (see [84, Prop. 3.6; 118, Cor. 2]) are no longer valid for inferring the asymptotic behavior of the stochastic law. The dynamic behavior of the weight vector, except for OJA+ and FENG, can be described in the following way (a numerical illustration follows the list):
1. There is an initial transient.
2. There are fluctuations around the locus of constant modulus, but with an
increasing bias; the fluctuations are a function of the learning rate, as will
be shown later.
3. The weight arrives at the desired direction.
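These three phases can be reproduced numerically. The following is a minimal sketch, not from the source: it assumes the MCA EXIN update in the standard form $w \leftarrow w - (\alpha y/\|w\|^2)\,(x - (y/\|w\|^2)\,w)$ with $y = w^T x$, and the data model, dimension, and learning rate are illustrative choices only.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative 3-D data model: covariance R with a well-separated
# smallest eigenvalue, whose eigenvector is the minor component.
eigvals = np.array([5.0, 2.0, 0.5])
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R = Q @ np.diag(eigvals) @ Q.T
minor = Q[:, 2]                   # eigenvector of the smallest eigenvalue

w = rng.standard_normal(3)
w /= np.linalg.norm(w)            # start on the unit sphere
alpha = 0.01                      # constant learning rate (illustrative)

for k in range(20001):
    x = Q @ (np.sqrt(eigvals) * rng.standard_normal(3))  # sample with covariance R
    y = w @ x
    n2 = w @ w
    # Assumed MCA EXIN update: the increment dw is orthogonal to w
    # (w @ dw = 0), so the modulus can only grow, as derived above.
    dw = -(alpha * y / n2) * (x - (y / n2) * w)
    w = w + dw
    if k % 5000 == 0:
        rq = (w @ R @ w) / (w @ w)               # Rayleigh quotient r(w, R)
        cos = abs(w @ minor) / np.linalg.norm(w)
        print(f"k={k:6d}  |w|={np.linalg.norm(w):.4f}  r(w,R)={rq:.4f}  |cos(w,minor)|={cos:.4f}")

A run of this sketch should show $\|w\|$ drifting upward (the increasing bias of phase 2) while $r(w, R)$ approaches the smallest eigenvalue and the cosine with the minor eigenvector approaches 1 (phase 3).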