Fig. 3.5 Same as in Fig. 3.3 but for the Newton-Raphson algorithm [figure: panels (a)-(c) for χ(R) = 10 with parameter values 0.1, 1.5, and 2.05, with axes labeled w1(n); learning curves for χ(R) = 1, 2, and 10 versus iteration number]
3.5 Further Comments
In [4], there is an interesting interpretation of the NR method that further clarifies why its convergence is independent of the eigenvalue spread. The result is that the NR algorithm works as an SD algorithm operating on an input signal generated by applying the Karhunen-Loève transform (which decorrelates the input signal) followed by a power normalization procedure, a combination known as a whitening process.
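To make this interpretation concrete, the following NumPy sketch (an illustration under the usual quadratic MSE setup, not code from [4]) verifies numerically that an NR step on the original coefficients coincides with an SD step taken in coordinates whitened by the Karhunen-Loève transform plus power normalization; the matrix R, vector p, coefficients w, and step size mu are stand-ins for the usual quantities.

```python
import numpy as np

# Quadratic MSE cost: gradient = 2(Rw - p), Hessian = 2R.
rng = np.random.default_rng(0)
L = 4
A = rng.standard_normal((L, L))
R = A @ A.T + L * np.eye(L)        # input autocorrelation matrix (SPD)
p = rng.standard_normal(L)         # cross-correlation vector
w = rng.standard_normal(L)         # current coefficient vector
mu = 0.5                           # step size

grad = 2 * (R @ w - p)

# Newton-Raphson: scale the gradient by the inverse Hessian (2R).
w_nr = w - mu * np.linalg.solve(2 * R, grad)

# Same step via whitening: the KLT (eigenvectors of R) decorrelates the
# input, and dividing by the square root of each eigenvalue normalizes
# the power, so the transformed autocorrelation T.T @ R @ T is the identity.
lam, Q = np.linalg.eigh(R)
T = Q / np.sqrt(lam)                   # KLT + power normalization (whitening)
w_t = np.linalg.solve(T, w)            # coefficients in whitened coordinates
p_t = T.T @ p                          # cross-correlation in whitened coordinates
grad_t = 2 * (w_t - p_t)               # gradient there (Hessian is 2I)
w_sd = T @ (w_t - 0.5 * mu * grad_t)   # SD step (step mu/2), mapped back

print(np.allclose(w_nr, w_sd))         # True: the two updates are identical
```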
The SD method presents a very slow convergence rate in the vicinity of the optimal solution, a drawback that the NR method overcomes. However, the latter does not take as much advantage of the large gradients at points far away from the minimum as the SD method does. To improve this tradeoff, the Levenberg-Marquardt (LM) algorithm [5] comes as a combination of the SD and NR algorithms, trying to share the merits of both methods. It basically adds a time-dependent regularization constant β(n) to the NR algorithm and, instead of multiplying it by the identity matrix I_L, uses the diagonal of the Hessian matrix. This change allows each component of the gradient to be scaled independently, providing larger movement along the directions where the gradient is smaller. At the beginning of the iteration process, a large β(n) is used, making the algorithm move closer to the direction given by the gradient, i.e., operate as an SD algorithm. As we get closer to the minimum, β(n) can be decreased, bringing the algorithm closer to the NR method. In addition, the use of regularization provides numerical stability to the NR method, as mentioned earlier.
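The update described above can be sketched as follows for the quadratic MSE cost, where the Hessian is 2R; the function name, the step size mu, and the geometric schedule suggested for β(n) are illustrative assumptions, not taken from [5].

```python
import numpy as np

def lm_step(w, R, p, beta_n, mu=1.0):
    """One Levenberg-Marquardt-style step for the MSE cost (sketch).

    Large beta_n: the diagonal term dominates, so the update follows a
    component-wise scaled gradient (SD-like behaviour).
    Small beta_n: the update approaches the Newton-Raphson step.
    """
    grad = 2 * (R @ w - p)                    # gradient of the MSE cost
    H = 2 * R                                 # Hessian of the MSE cost
    # Regularize with the diagonal of the Hessian rather than beta_n * I,
    # so each gradient component is scaled by its own curvature.
    M = H + beta_n * np.diag(np.diag(H))
    return w - mu * np.linalg.solve(M, grad)
```

A simple (again illustrative) schedule is to start from a large value and shrink it geometrically, e.g. beta_n = beta_0 * r**n with 0 < r < 1, so the iterations behave like SD far from the minimum and approach NR near it.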
The attempts to reduce the computational cost of the NR method led to the family of quasi-Newton methods [6]. The idea behind them is to approximate the Hessian matrix or its inverse in terms of the gradient. Some examples include the Broyden-Fletcher-Goldfarb-Shanno method (which, as the LM method, performs initially as an SD algorithm).
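As a generic illustration of this idea (not the derivation in [6]), the BFGS method keeps an approximation of the inverse Hessian and refreshes it using only differences of coefficient vectors and gradients, so the true Hessian is never formed or inverted; the function name below is an assumption made for the sketch.

```python
import numpy as np

def bfgs_inverse_hessian_update(H_inv, s, y):
    """Standard BFGS update of an inverse-Hessian approximation (sketch).

    s = w_new - w_old        (change in the coefficient vector)
    y = grad_new - grad_old  (change in the gradient)
    """
    rho = 1.0 / (y @ s)                 # curvature term, assumed positive
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    # Rank-two correction built only from gradient information.
    return V @ H_inv @ V.T + rho * np.outer(s, s)
```

Starting from H_inv = I, the first step taken with this approximation follows the negative gradient, which is the SD-like initial behaviour mentioned above.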
 