$\left[\left(W^T\right)^{-1}\right]_{ij}$ is the $ij$th element of the inverse of the transposed unmixing matrix $W$.
The gradient ($\nabla$) of $h$ (i.e., $\nabla h$) is the matrix of derivatives in which the $ij$th element is $\partial h / \partial W_{ij}$. If we consider all the elements of $W$, then
$$\nabla h = \frac{\partial h(Y)}{\partial W} = \left(W^T\right)^{-1} + E\!\left[\frac{f''(U)}{f'(U)}\, X^T\right] \qquad (7.12)$$
Given a finite sample of $M$ observed mixture values in $X^T$ and a putative unmixing matrix $W$, the expectation can be estimated as the mean

$$\nabla h = \left(W^T\right)^{-1} + \frac{1}{M}\sum_{i=1}^{M} \frac{f''(U_i)}{f'(U_i)}\, X_i^T \qquad (7.13)$$
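As an illustration, the finite-sample estimate in Eq. 7.13 translates directly into a few lines of NumPy. This is a minimal sketch, assuming $X$ stores the $M$ mixture samples as rows; the names `entropy_gradient` and `score` are ours (not from the text), with `score` standing for the ratio $f''(U)/f'(U)$:

```python
import numpy as np

def entropy_gradient(W, X, score):
    """Finite-sample estimate of the gradient of h(Y) w.r.t. W (Eq. 7.13).

    W     : (n, n) putative unmixing matrix
    X     : (M, n) observed mixtures, one sample per row
    score : elementwise function returning f''(U) / f'(U)
    """
    M = X.shape[0]
    U = X @ W.T                          # extracted signals U_i, one per row
    # (W^T)^{-1} plus the sample mean of score(U_i) X_i^T over the M samples
    return np.linalg.inv(W.T) + (score(U).T @ X) / M
```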
Therefore, in order to maximize the entropy of $Y = f(U)$, the rule for updating $W$ according to gradient ascent on the joint entropy ($\Delta W \propto \partial h(Y)/\partial W$) comes out in its most general form as follows:
$$\Delta W = \eta\left(\left(W^T\right)^{-1} + \frac{1}{M}\sum_{i=1}^{M} \frac{f''(U_i)}{f'(U_i)}\, X_i^T\right) \qquad (7.14)$$
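Continuing the sketch above, the update rule of Eq. 7.14 then becomes a plain gradient-ascent loop; the learning rate and iteration count below are illustrative placeholders, not values from the text:

```python
def infomax_update(W, X, score, eta=0.01, n_steps=500):
    """Gradient ascent on the joint entropy: W <- W + eta * grad h (Eq. 7.14)."""
    for _ in range(n_steps):
        W = W + eta * entropy_gradient(W, X, score)
    return W
```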
where $\eta$ is the learning rate. One can easily derive the expression for $\nabla h$ for a specific cdf of the source signals:
• A commonly used cdf to extract source signals is the logistic function. If the logistic function is chosen as the model cdf of the source signals, then the gradient ascent learning rule for updating $W$ can be obtained from Eq. 7.14 by replacing (see the code sketch after this list):

$$\frac{f''(U_i)}{f'(U_i)} = 1 - 2Y_i \qquad (7.15)$$

where $Y_i = f(U_i)$.
• Another important model cdf, used for extracting super-Gaussian (high-kurtosis) source signals, is the hyperbolic tangent function. Given the cdf $f(U) = \tanh(U)$, the gradient ascent learning rule for updating $W$ can be obtained from Eq. 7.14 by replacing:

$$\frac{f''(U_i)}{f'(U_i)} = -2\tanh(U_i) \qquad (7.16)$$
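In the terms of the earlier sketch, both replacements reduce to one-line score functions; assuming the logistic cdf $Y = 1/(1 + e^{-U})$ for Eq. 7.15:

```python
def logistic_score(U):
    """f''(U)/f'(U) = 1 - 2Y for the logistic cdf Y = 1/(1 + exp(-U)) (Eq. 7.15)."""
    Y = 1.0 / (1.0 + np.exp(-U))
    return 1.0 - 2.0 * Y

def tanh_score(U):
    """f''(U)/f'(U) = -2 tanh(U) for the hyperbolic tangent cdf (Eq. 7.16)."""
    return -2.0 * np.tanh(U)
```

Starting from, say, the identity matrix, `infomax_update(np.eye(X.shape[1]), X, logistic_score)` would then run the logistic variant of the rule.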
The Infomax algorithm evaluates the quality of any putative unmixing matrix $W$ using Eq. 7.14 through a given set of observed mixtures $X$ and the corresponding set of extracted signals $U$. Thus, one can deduce that, for the optimal unmixing matrix, the signals $Y = f(U)$ have maximum entropy and are therefore independent. If $f$ is chosen as the model cdf of the source signals, then maximization of the entropy of the neural network output is equivalent to minimization of the mutual information between the individual outputs in $U$ [40, 41]. A single layer neural network set up, which