SUBSPACE TRACKING FOR SIGNAL PROCESSING - Adaptive Signal Processing: Next Generation Solutions


where $C_x = W \Lambda W^T$ represents the EVD of $C_x$, with $W$ an orthogonal $n \times n$ matrix and $\Lambda = \mathrm{Diag}(\lambda_1, \ldots, \lambda_n)$. This is a quite natural criterion for statistical estimation purposes, even if the minimum variance property of the likelihood functional is actually an asymptotic property. To deduce an adaptive algorithm, a gradient ascent procedure has been proposed in [18] in which a new data sample $x(k)$ is used at each time iteration $k$ of the maximization of (4.64). Using the differential of $L(W, \Lambda)$ defined on the manifold of $n \times n$ orthogonal matrices [see [21, pp. 62-63] or Exercise 4.15 (4.93)], we obtain the following gradient of $L(W, \Lambda)$

$$\nabla_W L = W \left[ \Lambda^{-1} y(k) y^T(k) - y(k) y^T(k) \Lambda^{-1} \right]$$

$$\nabla_\Lambda L = -\Lambda^{-1} + \Lambda^{-2}\, \mathrm{Diag}[y(k) y^T(k)]$$

where $y(k) \overset{\text{def}}{=} W^T x(k)$. Then, the stochastic gradient update of $W$ yields

$$W(k+1) = W(k) + \mu_k W(k) \left[ \Lambda^{-1}(k) y(k) y^T(k) - y(k) y^T(k) \Lambda^{-1}(k) \right] \tag{4.65}$$

$$\Lambda(k+1) = \Lambda(k) + \mu'_k \left[ \Lambda^{-2}(k)\, \mathrm{Diag}[y(k) y^T(k)] - \Lambda^{-1}(k) \right] \tag{4.66}$$

where the stepsizes $\mu_k$ and $\mu'_k$ are possibly different. We note that, starting from an orthonormal matrix $W(0)$, the sequence of estimates $W(k)$ given by (4.65) is orthonormal up to the second-order term in $\mu_k$ only. To ensure in practice the convergence of this algorithm, it has been shown in [18] that it is necessary to orthonormalize $W(k)$ quite often to compensate for the orthonormality drift in $O(\mu_k^2)$. Using continuous-time system theory and differential geometry [21], a modification of (4.65) has been proposed in [18]. It is clear that $\nabla_W L$ is tangent to the curve defined by

$$W(t) = W(0) \exp\left[ t \left( \Lambda^{-1} y(k) y^T(k) - y(k) y^T(k) \Lambda^{-1} \right) \right]$$

for $t = 0$, where the matrix exponential is defined, for example, in [35, Chap. 11].
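The two claims above (the additive step (4.65) drifts from orthonormality at second order in the stepsize, and $\nabla_W L$ is tangent to the curve $W(t)$ at $t = 0$) can be checked numerically. The following is a minimal sketch, not the procedure of [18]: the helper names `skew_generator`, `expm_skew` and `orth_error`, the test values chosen for $\Lambda$, and the random data are all assumptions made for illustration. The matrix exponential is computed here from an eigendecomposition of the Hermitian matrix $iA$, which is one convenient option for a skew-symmetric argument and is not taken from the source.

```python
import numpy as np

def skew_generator(lam, y):
    """A = Lambda^{-1} y y^T - y y^T Lambda^{-1}, with Lambda = diag(lam)."""
    u = y / lam                          # Lambda^{-1} y
    return np.outer(u, y) - np.outer(y, u)

def expm_skew(A):
    """exp(A) for a real skew-symmetric A, via the Hermitian matrix iA."""
    w, V = np.linalg.eigh(1j * A)        # iA = V diag(w) V^H with real w
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def orth_error(W):
    """Frobenius norm of W^T W - I."""
    return np.linalg.norm(W.T @ W - np.eye(W.shape[1]))

rng = np.random.default_rng(0)
n = 4
W0, _ = np.linalg.qr(rng.standard_normal((n, n)))  # orthogonal W(0)
lam = np.array([4.0, 3.0, 2.0, 1.0])               # illustrative diagonal of Lambda
x = rng.standard_normal(n)                         # illustrative data sample
y = W0.T @ x
A = skew_generator(lam, y)

# Tangency: finite-difference derivative of W(t) at t = 0 versus W(0) A.
t = 1e-6
dW = (W0 @ expm_skew(t * A) - W0) / t
tangent_gap = np.linalg.norm(dW - W0 @ A)

# Orthonormality drift of the additive step W(0)(I + mu A) for two stepsizes.
mu_big, mu_small = 1e-2, 5e-3
drift_big = orth_error(W0 + mu_big * W0 @ A)
drift_small = orth_error(W0 + mu_small * W0 @ A)
drift_ratio = drift_big / drift_small

# Orthonormality error of a point on the curve W(t), here at t = 0.5.
curve_error = orth_error(W0 @ expm_skew(0.5 * A))

print(tangent_gap, drift_ratio, curve_error)
```

Halving the stepsize should divide the drift by about 4, since for an orthogonal $W(0)$ the additive step gives $W^T W - I = \mu^2 A^T A$ exactly, while the point on the curve stays orthogonal to rounding precision.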

Furthermore, we note that this curve lies in the manifold of orthogonal matrices if $W(0)$ is orthogonal, because $\exp(A)$ is orthogonal whenever $A$ is skew-symmetric ($A^T = -A$), and the matrix $\Lambda^{-1} y(k) y^T(k) - y(k) y^T(k) \Lambda^{-1}$ is clearly skew-symmetric. Moving on the curve $W(t)$ from point $t = 0$ in the direction of increasing values of $\nabla_W L$ amounts to letting $t$ increase. Thus, a discretized version of the optimization of $L(W, \Lambda)$ as a continuous function of $W$ is given by the following update scheme

$$W(k+1) = W(k) \exp\left\{ \mu_k \left[ \Lambda^{-1}(k) y(k) y^T(k) - y(k) y^T(k) \Lambda^{-1}(k) \right] \right\} \tag{4.67}$$

and the coupled update equations (4.66) and (4.67) form the MALASE algorithm. As mentioned above, the update factor $\exp\{ \mu_k [ \Lambda^{-1}(k) y(k) y^T(k) - y(k) y^T(k) \Lambda^{-1}(k) ] \}$ is an orthogonal matrix. This ensures that the orthonormality property is preserved by the MALASE algorithm, provided that the algorithm is initialized with an orthogonal matrix $W(0)$. However, it has been shown by the numerical experiments
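As an illustration of the coupled updates (4.66) and (4.67), the sketch below runs them on synthetic data and checks only that $W(k)$ remains orthogonal; it makes no claim about convergence behaviour. The function names `malase_step` and `expm_skew`, the constant stepsizes, the identity initialization, and the synthetic covariance are assumptions chosen for the example and do not come from [18]. As a further assumption, the matrix exponential is evaluated from an eigendecomposition of the Hermitian matrix $iA$ rather than by any method discussed in the source.

```python
import numpy as np

def expm_skew(A):
    """exp(A) for a real skew-symmetric A, via the Hermitian matrix iA."""
    w, V = np.linalg.eigh(1j * A)        # iA = V diag(w) V^H with real w
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def malase_step(W, lam, x, mu, mu_prime):
    """One coupled update: (4.67) for W, (4.66) for the diagonal of Lambda."""
    y = W.T @ x
    u = y / lam                                  # Lambda^{-1}(k) y(k)
    A = np.outer(u, y) - np.outer(y, u)          # skew-symmetric exponent
    W_next = W @ expm_skew(mu * A)               # (4.67)
    # Diag[y y^T] has entries y_i^2, so (4.66) acts elementwise on lam.
    lam_next = lam + mu_prime * (y**2 / lam**2 - 1.0 / lam)
    return W_next, lam_next

rng = np.random.default_rng(1)
n, steps = 4, 500
U, _ = np.linalg.qr(rng.standard_normal((n, n)))  # hypothetical true eigenvectors
true_eigs = np.array([4.0, 3.0, 2.0, 1.0])        # hypothetical true eigenvalues

W = np.eye(n)            # orthogonal initialization W(0)
lam = np.ones(n)         # initial diagonal of Lambda(0)
for k in range(steps):
    # Zero-mean sample with covariance U diag(true_eigs) U^T.
    x = U @ (np.sqrt(true_eigs) * rng.standard_normal(n))
    W, lam = malase_step(W, lam, x, mu=0.01, mu_prime=0.01)

orth_error = np.linalg.norm(W.T @ W - np.eye(n))
print(orth_error)
```

Because each factor in (4.67) is the exponential of a skew-symmetric matrix, `orth_error` should stay at rounding level without any explicit re-orthonormalization step.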
