Digital Signal Processing Reference
In-Depth Information
A very attractive property of FastICA is its cubic global convergence under ideal
system conditions (noiseless infinite sample observations perfectly fulfilling the in-
stantaneous linear model) [
36
,
40
]. This desirable feature, which helps to explain
the method's success, is revisited next.
9.4.4.2 FastICA as a Constant Step-Size Gradient Algorithm
As a parameter-free technique, the Newton approach has the potential to avoid the
convergence problems that may result from an unfortunate choice of step-size in
gradient algorithms. In fact, comparing Eq. (
9.13
), (
9.20
), and (
9.23
), we realize
that the approximate Newton update can be regarded as a gradient-descent iteration
to minimize the fourth-order moment
J
f
(
q
)
(Eq. (
9.19
)). Moreover, this implicit
gradient iteration uses a fixed step size
μ
1
/
12.
Although this specific value for the step-size parameter may seem arbitrary, it is
actually instrumental in endowing FastICA with the desirable cubic global conver-
gence property under ideal system conditions. To prove this claim, let us consider
an update of the form (
9.23
) with a generic but otherwise constant step size, say
ν
,
rather than the value
=−
−
1
/
3 used in FastICA's iteration. First recall that the global fil-
Q
T
q
links the extractor output with the sources as
y
g
T
s
, where
=
=
=
ter
g
g
1,
since
q
=
1 due to prewhitening (Sect.
9.4.4.1
). In terms of
g
, the resulting update
would read:
ν
E
y
3
s
=
I
ν
E
g
T
s
2
ss
T
g
.
g
+
=
g
+
+
(9.25)
y
3
s
y
2
s
(
s
T
g
)
y
2
ss
T
The second equality stems from the fact that E
g
.
The crucial step to prove FastICA's cubic global convergence is showing that the
updates induced in the entries of
g
by the above equation are uncoupled from each
other and have a cubic-only dependence, i.e., that
{
}=
E
{
}=
E
{
}
g
i
=
α
i
g
i
(9.26)
for all entries
i
1
,
2
,...,N
of vector
g
, with at least one of the coefficients
α
i
different from zero. Focusing on a generic entry
g
i
,Eq.(
9.25
) yields:
=
N
g
i
=
g
i
+
ν
h
ij
g
j
(9.27)
j
=
1
y
2
s
i
s
j
}
y
2
ss
T
where
h
ij
E
{
is the
(i,j)
th element of matrix E
{
}
, which can be
g
T
E
s
i
s
j
ss
T
expressed as the quadratic form
h
ij
=
g
. Under the source statistical
independence assumption
A2
, we can easily show that
{
}
k
=
i
g
k
+
3
)g
i
,i
(κ
i
+
=
j,
h
ij
=
(9.28)
2
g
i
g
j
,
i
=
j,
Search WWH ::
Custom Search