which demonstrates that the larger the values of i_n, the smaller the step size has to be to still guarantee stability of the algorithm. The time constant is bounded by
\frac{-1}{\ln\left(1 - \gamma\left(1 + N^{-1}\sum_n i_n^2\right)\right)} \leq T \leq \infty ,    (5.24)
showing that a large eigenvalue spread |λ_2/λ_1|, caused by on average high magnitudes of i_n, pushes the time constant towards infinity, which results in very slow convergence. Therefore, the convergence rate of steepest gradient descent frequently depends on the range of the inputs³. This dependency is demonstrated empirically in Sect. 5.4.
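To make this dependency concrete, the following sketch (illustrative only; the data, step size, and function names are assumptions, not taken from the text) runs batch steepest gradient descent on inputs of the form x_n = (1, i_n)^T for two input scales. With large inputs, the step size admitted by the largest eigenvalue is small, so the weight associated with the small eigenvalue converges very slowly:

# Illustrative sketch: convergence of steepest gradient descent under
# different eigenvalue spreads of the input correlation matrix.
import numpy as np

def steepest_descent(X, y, gamma, steps=100):
    """Full-batch gradient descent on the mean squared error (1/N)||X w - y||^2."""
    N, D = X.shape
    w = np.zeros(D)
    for _ in range(steps):
        grad = (2.0 / N) * X.T @ (X @ w - y)
        w = w - gamma * grad
    return w

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])

for scale in (1.0, 10.0):                          # typical magnitude of the inputs i_n
    i_n = scale * rng.standard_normal(1000)
    X = np.column_stack([np.ones_like(i_n), i_n])  # inputs x_n = (1, i_n)^T
    y = X @ w_true
    lam_max = np.linalg.eigvalsh(X.T @ X / len(y)).max()
    gamma = 1.0 / lam_max                          # step size limited by the largest eigenvalue
    w = steepest_descent(X, y, gamma)
    print(f"input scale {scale:4.1f}: w = {w}")    # large scale -> slow progress on the bias weight

With the larger input scale the eigenvalue spread is large, and after the same number of steps the bias weight is still far from its target, in line with the bound (5.24).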
5.3.3 Least Mean Squared
The Least Mean Squared (LMS) algorithm is an incremental approximation to steepest gradient descent. Rather than performing gradient descent on the error function given all observations, it follows the gradient of the error function given only the current observation. For this reason, it is also known as Stochastic Incremental Steepest Gradient Descent, ADALINE, or, after its developers Widrow and Hoff [234], the Widrow-Hoff Update.
By inspecting (5.5), the error function for the (N+1)th observation based on the model after N observations is m(x_{N+1})(w_N^T x_{N+1} - y_{N+1})^2, and its gradient with respect to w_N is therefore 2 m(x_{N+1}) x_{N+1} (w_N^T x_{N+1} - y_{N+1}). Using this local gradient estimate as a surrogate for the global gradient, the LMS update is given by

w_{N+1} = w_N + \gamma_{N+1}\, m(x_{N+1})\, x_{N+1} \left(y_{N+1} - w_N^T x_{N+1}\right) ,    (5.25)

starting with an arbitrary w_0.
As the gradient estimate is based only on the current input, the method suffers from gradient noise. Due to this noise, a constant step size γ will cause random motion close to the optimal approximation [105, Chap. 5].
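The following sketch (illustrative; the matching function, data, and step size are assumptions rather than details from the text) implements the update (5.25) with a constant step size and shows the resulting random motion around the optimal weights:

# Illustrative sketch of the LMS update (5.25) with a constant step size.
import numpy as np

def lms_step(w, x, y, gamma, m):
    """One LMS update: w <- w + gamma * m(x) * x * (y - w^T x)."""
    return w + gamma * m(x) * x * (y - w @ x)

rng = np.random.default_rng(1)
w_true = np.array([0.5, 1.5])
w = np.zeros(2)                     # arbitrary initial weight vector w_0
m = lambda x: 1.0                   # hypothetical classifier that matches every input
gamma = 0.05                        # constant step size

for _ in range(2000):
    i_n = rng.uniform(-1.0, 1.0)
    x = np.array([1.0, i_n])        # input x_n = (1, i_n)^T
    y = float(w_true @ x) + 0.1 * rng.standard_normal()   # noisy observation
    w = lms_step(w, x, y, gamma, m)

print(w)    # hovers near w_true but keeps fluctuating due to gradient noise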
Misadjustment Due to Local Gradient Estimate
Let h_N(w) = c_N^{-1} \|X_N w - y_N\|^2 be the mean squared error (MSE) after N observations as a function of the weight vector. The excess mean square estimation error is the difference between the MSE of the LMS algorithm and the minimal MSE after (5.16). The ratio between the excess MSE and the minimal MSE is the misadjustment, which is a measure of how far away the convergence area of LMS is from the optimal estimate. The estimate error for some small constant step size can, according to [105, Chap. 5], be estimated by

h_N(\hat{w}_N) + \frac{\gamma\, h_N(\hat{w}_N)}{2} \sum_{j=1}^{J} \lambda_j ,    (5.26)
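A rough numerical check of this estimate is sketched below (an illustration under assumptions: the classifier matches all inputs so that c_N = N, the λ_j are taken to be the eigenvalues of the input correlation matrix, and the data and constants are invented for the example):

# Illustrative sketch: empirical MSE of LMS with a small constant step size
# compared against the estimate h_N(w_hat) * (1 + gamma/2 * sum_j lambda_j).
import numpy as np

rng = np.random.default_rng(2)
N, gamma = 50_000, 0.01
w_true = np.array([0.5, 1.5])

i_n = rng.uniform(-1.0, 1.0, N)
X = np.column_stack([np.ones(N), i_n])           # inputs x_n = (1, i_n)^T
y = X @ w_true + 0.1 * rng.standard_normal(N)    # noisy targets

# Minimal MSE, obtained from the batch least squares solution w_hat.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
mse_min = np.mean((X @ w_hat - y) ** 2)

# Run LMS and measure the MSE it settles around (second half of the run,
# after the initial transient has died out).
w = np.zeros(2)
sq_errs = []
for x_n, y_n in zip(X, y):
    sq_errs.append((y_n - w @ x_n) ** 2)
    w = w + gamma * x_n * (y_n - w @ x_n)        # LMS update (5.25) with m(x) = 1
mse_lms = np.mean(sq_errs[N // 2:])

lam = np.linalg.eigvalsh(X.T @ X / N)            # eigenvalues of the input correlation matrix
mse_pred = mse_min * (1.0 + gamma / 2.0 * lam.sum())
print(f"measured LMS MSE: {mse_lms:.5f}   estimate (5.26): {mse_pred:.5f}")

For a small step size the two numbers lie close together, and according to (5.26) the gap between the measured MSE and the minimal MSE grows in proportion to the step size, which is the misadjustment described above.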
³ A similar LCS-related analysis was done by Lanzi et al. [142, 143], but there the stability criteria for steepest gradient descent were applied to the LMS algorithm.