algorithm. A very small absolute value of μ theoretically guarantees monotonic convergence to a local stationary point of the contrast, but convergence may be too slow, since the update takes only very small steps in the gradient direction. To speed up the algorithm, μ may be increased, but the iterations may then oscillate around a local extremum without settling down, a phenomenon known as misadjustment [37], or may even diverge. This trade-off between convergence speed and solution quality is common to both batch (operating over a signal block) and adaptive (stochastic, recursive, sample-by-sample) implementations of gradient-based optimization algorithms [1, 2, 4].
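As a toy illustration of this trade-off (the quadratic contrast, noise level, and step sizes below are hypothetical, not from the text), consider a stochastic gradient ascent driven by a noisy gradient estimate, as in an adaptive implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_ascent(mu, iters=5000, w0=0.0):
    """Fixed-step stochastic gradient ascent on the toy contrast J(w) = -(w - 2)^2."""
    w = w0
    for _ in range(iters):
        g = -2.0 * (w - 2.0) + rng.normal(scale=0.5)  # noisy gradient estimate
        w += mu * g                                   # fixed step-size update
    return w

for mu in (0.002, 0.2):
    print(mu, stochastic_ascent(mu))
# mu = 0.002: slow, but settles very close to the optimum w = 2;
# mu = 0.2: converges fast but keeps jittering around w = 2 (misadjustment);
# mu > 1 would make this quadratic example diverge outright.
```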
Many works in the literature have been devoted to making an optimal, or at least judicious, choice of the step size, aiming at fast convergence with low misadjustment. A classical approach consists in starting the iterations with a large value of μ and then decreasing it progressively as the algorithm converges (a toy sketch of such a schedule is given after this paragraph). However, detecting whether the algorithm is approaching the right solution is far from trivial, which often limits the performance of this simple idea. Newton updates are in theory parameter-free, yet, as seen earlier in this chapter for the FastICA algorithm, they can actually reduce to constant step-size gradient iterations. A fact long unnoticed, pointed out in [20, 21] and recently developed in [71, 74, 76], is that the shape of the kurtosis contrast enables the closed-form computation of the optimal step-size value at each extracting-vector update. The resulting algorithm is described next.
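Before turning to that algorithm, here is a toy sketch of the classical decreasing-step approach on the same hypothetical quadratic contrast as above; the schedule μ_k = μ₀/(1 + k/τ) is just one common choice, not one prescribed by the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def scheduled_ascent(mu0=0.5, tau=50.0, iters=5000, w0=0.0):
    """Stochastic gradient ascent on J(w) = -(w - 2)^2 with a decaying step size."""
    w = w0
    for k in range(iters):
        mu = mu0 / (1.0 + k / tau)                    # large early steps, vanishing later
        g = -2.0 * (w - 2.0) + rng.normal(scale=0.5)  # noisy gradient estimate
        w += mu * g
    return w

print(scheduled_ascent())  # fast initial progress, low final misadjustment
```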
9.4.5.2 Algebraic Exact Line Search: The RobustICA Algorithm
A simple approach to addressing the trade-off set by the learning coefficient in gradient algorithms is exact line search. This optimization technique seeks the step size leading to the global optimum of the contrast along the current search direction:
$$\mu_{\text{opt}} = \arg\max_{\mu}\; \mathcal{J}_{\kappa}(\mathbf{w} + \mu\mathbf{d}) \qquad (9.30)$$
where d typically represents the gradient vector at w, or any other suitably chosen direction. In most cases, this one-dimensional optimization is costly, as it usually requires iterative numerical methods [52]. However, when the contrast is a rational function or a polynomial, the search for the optimal step size can be notably simplified [20, 21].
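As a concrete illustration, here is a minimal real-valued sketch of this algebraic line search for the normalized kurtosis contrast; the function names and data layout are illustrative, and the full derivation, including the complex-valued case, is given in [74]. Along w + μd the output is y = a + μb, with a = wᵀx and b = dᵀx, so E[y²] and E[y⁴] are polynomials in μ whose coefficients are sample cross-moments; the derivative of the contrast then vanishes at the real roots of a fourth-degree polynomial, which can be found algebraically:

```python
import numpy as np

def kurtosis_contrast(y):
    """Normalized kurtosis of a zero-mean real-valued signal."""
    m2 = np.mean(y ** 2)
    m4 = np.mean(y ** 4)
    return m4 / m2 ** 2 - 3.0

def optimal_step(X, w, d):
    """Algebraic exact line search for the real-valued kurtosis contrast.

    X: observations (channels x samples); w: extracting vector; d: search
    direction. Returns the mu maximizing |kurtosis| of (w + mu*d)^T x.
    """
    a = w @ X                       # current extractor output
    b = d @ X                       # output along the search direction
    # E[(a + mu*b)^2] as a degree-2 polynomial in mu (highest power first)
    m2 = np.array([np.mean(b**2), 2 * np.mean(a * b), np.mean(a**2)])
    # E[(a + mu*b)^4] as a degree-4 polynomial in mu
    m4 = np.array([np.mean(b**4), 4 * np.mean(a * b**3),
                   6 * np.mean(a**2 * b**2), 4 * np.mean(a**3 * b),
                   np.mean(a**4)])
    # The contrast derivative vanishes where p(mu) = m4'*m2 - 2*m2'*m4 = 0
    p = np.polysub(np.polymul(np.polyder(m4), m2),
                   2 * np.polymul(np.polyder(m2), m4))
    p = p[1:]                       # the degree-5 terms cancel analytically
    roots = np.roots(p)
    mus = roots[np.abs(roots.imag) < 1e-9].real   # keep real candidates
    # Pick the candidate maximizing the absolute normalized kurtosis
    return max(mus, key=lambda mu: abs(kurtosis_contrast(a + mu * b)))
```

In a gradient-based extraction loop, d would be the gradient of the contrast at w, and the update w ← w + μ_opt d (followed by normalization) would replace the fixed-step rule; this sketch captures the step-size selection at the heart of RobustICA [74].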
Although this property is satisfied by most functions based on fourth-order statistics, using the full version of the kurtosis contrast (9.9) presents some attractive advantages relative to simplified versions such as the fourth-order moment [74]:
• The kurtosis is a valid source extraction contrast even if prewhitening is not performed [64]. Avoiding prewhitening prevents the performance limitations imposed by this second-order processing step [8].
• The kurtosis is a valid contrast in both real- and complex-valued mixture scenarios, so that both cases can be treated without any modification. Both types of sources can appear simultaneously in a given mixture, and the mixing matrix entries can also be real or complex. Complex sources need not be circularly distributed.