Digital Signal Processing Reference
6.3.5 Equivariant Adaptive Source Separation/Natural Gradient
As pointed out in Chapters 3 and 4, a number of optimization algorithms in
signal processing are based on the gradient method, the main idea of which
is to exploit the gradient of a given cost function in order to find its minimum
(or maximum). Following this procedure, the adaptation of a matrix $\mathbf{W}$ has the
general form
$$\mathbf{W} \leftarrow \mathbf{W} \pm \mu \frac{\partial J(\mathbf{W})}{\partial \mathbf{W}} \qquad (6.67)$$
where the sign of the update term depends on whether we are dealing with
a maximization or a minimization problem, and $J(\mathbf{W})$ denotes a generic cost
function.
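The additive rule (6.67) can be sketched in a few lines. The quadratic cost below is only a placeholder chosen so that the minimizer is known in advance; it is not a cost function from this chapter.

```python
import numpy as np

def gradient_update(W, grad_J, mu=0.01, maximize=False):
    """One step of the generic adaptation rule (6.67):
    W <- W -/+ mu * dJ(W)/dW, with the sign set by the problem type."""
    sign = 1.0 if maximize else -1.0
    return W + sign * mu * grad_J(W)

# Placeholder cost J(W) = ||W - A||_F^2, minimized at W = A
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
grad = lambda W: 2.0 * (W - A)   # dJ/dW

W = np.eye(2)
for _ in range(500):
    W = gradient_update(W, grad, mu=0.05)
# W converges toward A
```

Each step contracts the error `W - A` by a constant factor, so the iterate approaches the known minimizer.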
In [58], another approach is presented. Cardoso and Laheld employ
a serial adaptation, which consists of updating the separating matrix
according to
$$\mathbf{W} \leftarrow \left[\mathbf{I} - \lambda\, G(\mathbf{y})\right]\mathbf{W} \qquad (6.68)$$
where $G(\cdot)$ maps a vector onto a matrix and $\lambda$ represents the learning step.
Hence, the update is performed by left-multiplying the previous separating
matrix by a correction matrix, instead of adding a term to it.
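A minimal sketch of the serial update (6.68). The particular choice of $G(\mathbf{y})$ below (a simple decorrelation-type term) is only illustrative and is not the full nonlinearity used by Cardoso and Laheld in [58].

```python
import numpy as np

def serial_update(W, G_y, lam=0.01):
    """Serial (multiplicative) adaptation of (6.68):
    W <- (I - lam * G(y)) W, i.e. the separating matrix is
    left-multiplied by a small correction matrix."""
    n = W.shape[0]
    return (np.eye(n) - lam * G_y) @ W

# Illustrative choice only: G(y) = y y^T - I
y = np.array([1.0, -0.5])
G = np.outer(y, y) - np.eye(2)

W = np.eye(2)
W_new = serial_update(W, G, lam=0.1)
```

Starting from `W = I`, one step simply yields the correction matrix itself, which makes the multiplicative structure of the rule easy to inspect.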
Therefore, the adaptation rule in (6.68) suggests that we can redefine
the concept of gradient. In the standard case, the gradient at $\mathbf{W}$ can be
understood as being the first-order term of a Taylor series of $J(\mathbf{W} + \mathbf{D})$:
$$J(\mathbf{W} + \mathbf{D}) \approx J(\mathbf{W}) + \mathrm{tr}\!\left( \mathbf{D}^T \frac{\partial J(\mathbf{W})}{\partial \mathbf{W}} \right) \qquad (6.69)$$
where $\mathbf{D}$ corresponds to an increment. On the other hand, the relative
gradient can be defined in a similar fashion from the expansion of $J(\mathbf{W} + \mathbf{D}\mathbf{W})$:
$$J(\mathbf{W} + \mathbf{D}\mathbf{W}) \approx J(\mathbf{W}) + \mathrm{tr}\!\left( \mathbf{W}^T \mathbf{D}^T \frac{\partial J(\mathbf{W})}{\partial \mathbf{W}} \right) \approx J(\mathbf{W}) + \mathrm{tr}\!\left( \mathbf{D}^T \frac{\partial_R J(\mathbf{W})}{\partial \mathbf{W}} \right) \qquad (6.70)$$

where, by the cyclic property of the trace, the relative gradient is identified as
$\partial_R J(\mathbf{W})/\partial \mathbf{W} = \left[\partial J(\mathbf{W})/\partial \mathbf{W}\right] \mathbf{W}^T$.
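A quick numerical check of the expansion in (6.70). The quadratic cost is an arbitrary illustration, not one from the chapter; the point is that the first-order term built from the relative gradient matches $J(\mathbf{W} + \mathbf{D}\mathbf{W})$ up to a remainder of order $\|\mathbf{D}\|^2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative cost (an assumption, not a cost from the chapter):
# J(W) = ||W - A||_F^2, with ordinary gradient dJ/dW = 2 (W - A).
A = rng.standard_normal((3, 3))
J = lambda W: float(np.sum((W - A) ** 2))
grad = lambda W: 2.0 * (W - A)

W = rng.standard_normal((3, 3))
D = 1e-4 * rng.standard_normal((3, 3))   # small multiplicative increment

# Relative gradient: d_R J(W)/dW = [dJ(W)/dW] W^T, so that (6.70) reads
#   J(W + D W) ~ J(W) + tr(D^T d_R J(W)/dW)
rel_grad = grad(W) @ W.T
first_order = J(W) + np.trace(D.T @ rel_grad)

err = abs(J(W + D @ W) - first_order)    # remainder is O(||D||^2)
```

The two trace forms in (6.70) agree exactly, since $\mathrm{tr}(\mathbf{W}^T \mathbf{D}^T \mathbf{G}) = \mathrm{tr}(\mathbf{D}^T \mathbf{G} \mathbf{W}^T)$ by cyclic permutation.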