by using (2.137) and $\mu = \bigl(y^2(t) + 1\bigr)/\|\mathbf{w}(t)\|_2^2$, and

$$ \mathbf{w}(t)\big|_{\text{EXIN}} \approx \mathbf{w}(t)\big|_{\text{FENG}} \qquad (2.141) $$

by using (2.137) and $\mu = \|\mathbf{w}(t)\|_2^2$. In this case, as the iterations increase, $\mu \to \lambda_n$ [see (2.31)] and then $\mathbf{w}(t)\big|_{\text{FENG}} \to \mathbf{w}(t)\big|_{\text{EXIN}}$, as seen in Section 2.6.3.
2.7 FLUCTUATIONS (DYNAMIC STABILITY) AND LEARNING RATE
The learning rate $\alpha(t)$ must be quite small to avoid instability and the consequent divergence of the learning law. This requirement implies several problems (illustrated in the sketch after this list):
• A small learning rate gives a slow learning speed.
• It is difficult to find a good learning rate that prevents learning divergence.
• Both the transient and the accuracy of the solution are affected by the choice of the learning rate.
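This tradeoff can be checked numerically. The following is a minimal sketch, not taken from the text: it runs the classical OJA-type MCA rule $\mathbf{w}(t+1) = \mathbf{w}(t) - \alpha\, y(t)\,[\mathbf{x}(t) - y(t)\,\mathbf{w}(t)]$ as a representative stochastic MCA law (the data, the learning rates, and the divergence threshold are illustrative assumptions), showing that a very small $\alpha$ learns slowly while a large $\alpha$ diverges.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean data whose minor component is e3 = [0, 0, 1]
# (data, dimensions, and learning rates are illustrative assumptions).
C = np.diag([5.0, 2.0, 0.1])
X = rng.multivariate_normal(np.zeros(3), C, size=5000)

def run_mca(alpha):
    """One pass of the OJA-type MCA rule: w <- w - alpha*y*(x - y*w)."""
    w = np.ones(3) / np.sqrt(3.0)
    for x in X:
        y = w @ x
        w = w - alpha * y * (x - y * w)
        if not np.isfinite(w).all() or np.linalg.norm(w) > 1e6:
            return None  # (dynamic) instability divergence
    return w

for alpha in (1e-4, 1e-2, 0.5):
    w = run_mca(alpha)
    if w is None:
        print(f"alpha={alpha}: diverged")
    else:
        cos = abs(w[2]) / np.linalg.norm(w)  # alignment with minor component
        print(f"alpha={alpha}: |cos(w, e3)| = {cos:.3f}")
```

With these assumed settings, the smallest rate is still far from the minor component after one pass, the intermediate rate converges, and the largest rate diverges.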
Then the study of the stochastic discrete learning laws of the three neurons with respect to $\alpha(t)$ becomes an analysis of their dynamics. Define:
$$ r' = \frac{\bigl[\mathbf{w}^T(t+1)\,\mathbf{x}(t)\bigr]^2}{\|\mathbf{w}(t+1)\|_2^2} \qquad (2.142) $$

$$ r = \frac{\bigl[\mathbf{w}^T(t)\,\mathbf{x}(t)\bigr]^2}{\|\mathbf{w}(t)\|_2^2} \qquad (2.143) $$
$$ \rho(\alpha) = \frac{r'}{r}, \qquad r \neq 0, \qquad p = \|\mathbf{w}(t)\|_2^2, \qquad u = y^2(t) $$
The two scalars $r'$ and $r$ represent, respectively, the squared perpendicular distance from the input $\mathbf{x}(t)$ to the data-fitting hyperplane whose normal is given by the weight vector and which passes through the origin, after and before the weight increment. Recalling the definition of MC, it should hold that $r' \leq r$. If this inequality is not valid, the learning law increases the estimation error because of disturbances caused by noisy data. When this increase is too large, it makes $\mathbf{w}(t)$ deviate drastically from normal learning, which may result in divergence or in fluctuations (implying an increased learning time). This problem is here called dynamic instability, and the possible divergence is here defined as (dynamic) instability divergence. In the remainder of this section we analyze $\rho$ for each MCA learning law.
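To make these quantities concrete, here is a minimal numerical sketch, not taken from the text: it performs a single step of the same illustrative OJA-type MCA rule used above and evaluates $r$, $r'$, and $\rho(\alpha)$ from (2.142) and (2.143), checking whether the step satisfies $r' \leq r$.

```python
import numpy as np

rng = np.random.default_rng(1)

def rho_one_step(w, x, alpha):
    """Return (r, r', rho) for one weight increment, per (2.142)-(2.143).

    The OJA-type rule w <- w - alpha*y*(x - y*w) stands in for the
    generic MCA learning law (an illustrative assumption).
    """
    y = w @ x
    r = y**2 / (w @ w)                        # squared distance before the step
    w_new = w - alpha * y * (x - y * w)       # weight increment
    r_new = (w_new @ x)**2 / (w_new @ w_new)  # squared distance after the step
    return r, r_new, r_new / r

w = rng.standard_normal(3)
x = rng.standard_normal(3)
for alpha in (0.01, 0.1, 1.0, 5.0):
    r, r_new, rho = rho_one_step(w, x, alpha)
    verdict = "r' <= r" if rho <= 1.0 else "r' > r (fluctuation risk)"
    print(f"alpha={alpha}: rho = {rho:.4f} ({verdict})")
```

For small $\alpha$ the step reduces the squared distance ($\rho \leq 1$), while for large $\alpha$ the update can overshoot the hyperplane and increase the estimation error, which is exactly the dynamic instability described above.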