is the average over all matched outputs, it becomes apparent that the LMS
algorithm does not achieve this optimum for arbitrary step sizes. Nonetheless,
it can readily be applied in recency-weighted settings, such as handling
non-stationary processes, as is required in reinforcement learning applications.
5.3.4 Normalised Least Mean Squared
As seen from (5.25), the magnitude of the weight update is directly proportional
to the new input vector x_{N+1}, causing gradient noise amplification [105,
Chap. 6]. Thus, if some elements of the input vector are large, the correction
based on a local error is amplified and causes additional noise. This problem
can be overcome by normalising the correction by the squared Euclidean norm of
the input, resulting in the update
w_{N+1} = w_N + \gamma_t \, m(x_{N+1}) \, \frac{x_{N+1}}{\|x_{N+1}\|^2} \left( y_{N+1} - w_N^\top x_{N+1} \right) .   (5.29)
This update equation can also be derived by calculating the weight vector update
that minimises the norm of the weight change ||w_{N+1} - w_N||^2, subject to the
constraint m(x_{N+1}) w_{N+1}^\top x_{N+1} = y_{N+1}. As such, the normalised LMS filter can
be seen as the solution to a constrained optimisation problem.
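
To make the update concrete, the following Python sketch implements (5.29) for a single observation. The names nlms_update, gamma and matched are illustrative rather than taken from the text, and the matching value m(x_{N+1}) is assumed to be a scalar. The example at the end also illustrates the constrained-optimisation view: with gamma = 1 and full matching, the updated weights satisfy the constraint w_{N+1}^\top x_{N+1} = y_{N+1} exactly.

```python
import numpy as np

def nlms_update(w, x, y, gamma, matched=1.0):
    """One normalised LMS step as in (5.29).

    w       -- current weight estimate w_N
    x       -- new input vector x_{N+1}
    y       -- new output y_{N+1}
    gamma   -- step size gamma_t
    matched -- matching value m(x_{N+1}), assumed here to be a scalar
    """
    error = y - np.dot(w, x)                      # local prediction error
    return w + gamma * matched * (x / np.dot(x, x)) * error

# With gamma = 1 and full matching, the new weights interpolate the latest
# observation exactly, i.e. they satisfy w_{N+1}^T x_{N+1} = y_{N+1}.
w = np.zeros(3)
x = np.array([0.5, -1.0, 2.0])
y = 1.5
w_new = nlms_update(w, x, y, gamma=1.0)
print(np.dot(w_new, x))                           # 1.5 (up to rounding)
```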
Regarding stability, the step size parameter γ is now weighted by the inverse of
the squared norm of the input vector. Thus, stability in the MSE sense depends
on the current input. The lower bound on the step size is still 0, while the upper
bound is generally larger than 2 if the input values are overestimated, and smaller
than 2 otherwise. The optimal step size, located at the largest decrease of the mean
squared deviation, is the centre of these two bounds [105, Chap. 6].
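
Written out, this weighting amounts to reading (5.29) as a standard LMS step with an input-dependent effective step size; the following rearrangement is only a restatement of the update above, not an additional result from [105]:

```latex
% (5.29) rearranged: a standard LMS correction whose step size is scaled
% by the inverse squared norm of the current input.
\[
  w_{N+1} = w_N
    + \underbrace{\frac{\gamma_t}{\|x_{N+1}\|^2}}_{\text{effective step size}}
      \, m(x_{N+1}) \, x_{N+1}
      \left( y_{N+1} - w_N^\top x_{N+1} \right),
\]
% so the bounds on the step size that guarantee stability in the MSE sense
% depend on \|x_{N+1}\|^2, that is, on the current input.
```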
As expected, the normalised LMS algorithm features a rate of convergence
that is higher than that of the standard LMS filter, as empirically demonstrated
by Douglas [76]. One drawback of the modification is that one needs to check
whether ||x_{N+1}||^2 is zero, in which case no update is performed, in order to
avoid a division by zero.
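
A minimal way of adding this check, continuing the illustrative sketch above (the tolerance eps is an assumption; the text only requires the squared norm to be tested for zero):

```python
import numpy as np

def nlms_update_safe(w, x, y, gamma, matched=1.0, eps=1e-12):
    """Normalised LMS step that skips the update when ||x_{N+1}||^2 is
    (numerically) zero, avoiding the division by zero."""
    sq_norm = float(np.dot(x, x))
    if sq_norm < eps:
        return w                                  # no update for a zero-norm input
    return w + gamma * matched * (x / sq_norm) * (y - np.dot(w, x))
```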
To summarise, both variants of the LMS algorithm have low computational
and space costs O(D_X), but they rely only on the local gradient estimate and may
hence feature slow convergence and misadjustment. The step size can be adjusted
to improve either convergence speed or misadjustment, but one cannot
improve both at the same time. Additionally, the speed of convergence is by
(5.21) influenced by the values of the inputs and might be severely reduced by
ill-conditioned inputs, as will be demonstrated in Sect. 5.4.
Let us recall that quickly getting an idea of the goodness-of-fit of a classifier
model, as measured by the model variance (5.13), requires a good weight vector
estimate. Despite their low computational cost, gradient-based methods are known
to suffer from low speed of convergence and are therefore not necessarily the best
choice for this task. The following sections describe incremental methods that
are computationally more costly, but are able to recursively track the weight
vector that satisfies (5.16) and are therefore optimal in the least squares sense.