The single-layered neural network Y = f(WX) will be used for weight updating according to gradient ascent on the joint entropy h(Y), ΔW ∝ ∂h(Y)/∂W. The nonlinearity f in the neuron should roughly approximate the cdf of the source distribution. Therefore, the presented infomax algorithm in the real and complex domain proceeds by maximizing the entropy of the output of a single-layered neural network.
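As a quick illustration of why f should approximate the source cdf (a sketch with our own toy data, not taken from the text): when the logistic sigmoid is the cdf of the source distribution, the network output is uniformly distributed, i.e. it attains maximum entropy on a bounded interval.

```python
import numpy as np

# Sketch (names and data are ours, for illustration): the logistic sigmoid
# f(u) = 1 / (1 + exp(-u)) is the cdf of the logistic density.  If the
# source u is drawn from that density, the output y = f(u) is uniform on
# (0, 1) -- the maximum-entropy distribution on a bounded interval, which
# is exactly what infomax exploits.
rng = np.random.default_rng(0)

u = rng.logistic(loc=0.0, scale=1.0, size=100_000)  # logistic-distributed source
y = 1.0 / (1.0 + np.exp(-u))                        # pass through matching cdf

# y should be close to Uniform(0, 1): mean near 0.5, variance near 1/12
print(y.mean(), y.var())
```

If f were a poor match for the source cdf, the output distribution would pile up away from uniform and the output entropy would be lower.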
7.2.5 Feature Extraction with ℝ-ICA
Let X be the matrix of observed variables and U its transformation given by some matrix of coefficients W, U = WX. The ICA problem may be stated as finding the matrix W so that the random variables u_i (rows of U) are as independent as possible
(finding statistically independent basis images). The goal of the presented ICA algorithm is to maximize the mutual information between the environment X and the output Y of the neural network. This is achieved through gradient ascent on the entropy of the output with respect to the weight matrix W. The basic steps of the ℝ-ICA algorithm for feature extraction, derived from the principle of optimal information transfer in neurons with sigmoidal transfer functions, can be summarized as follows:
1. Collect the images in the data matrix X (M by N) so that images are in rows and pixels are in columns.
2. Apply ℝ-PCA. The PCA basis vectors in E^T are analyzed for independent components, where E (N by M) is the matrix of the M eigenvectors of the M images.
3. Apply whitening (sphering) to the data. The final transformation matrix will be the product of the whitening matrix and the optimal unmixing matrix.
4. Sources are modeled as real random vectors. Take the sigmoidal transfer function f as the joint cdf of the source signals, in view of optimal information transfer in a real-valued neural network.
5. Derive a contrast function (joint entropy h) in view of the real-valued neural network. Perform maximization of the entropy h(Y) by the neural network using gradient ascent. If X is the input vector to the ANN, then f(WX) is the output.
6. Find the optimal matrix W such that MAX[h{f(WX)}]. This can be done as:
• Define a surface h{f(U)}.
• Find the gradient ∇h with respect to W and ascend it, ΔW ∝ ∇h; then

ΔW = η ( [W^T]^{-1} + (1/M) ∑_{i=1}^{M} (f̈(u_i)/ḟ(u_i)) X^T )   (7.17)

where f̈(u)/ḟ(u) = (1 − 2 f(u)).
When h is maximum, W is W_OPT.
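One ascent step of the update rule in step 6 can be sketched as follows. This is our reading of (7.17) with a logistic f (for which f̈(u)/ḟ(u) = 1 − 2f(u)); the function names, learning rate, and toy mixing matrix are illustrative assumptions, not the book's code.

```python
import numpy as np

def f(u):
    # logistic sigmoid, the assumed cdf of the sources
    return 1.0 / (1.0 + np.exp(-u))

def infomax_step(W, X, eta=0.01):
    # One gradient-ascent step on the output entropy, batch-averaged
    # over the M samples (columns of X):
    #   dW = eta * ( [W^T]^{-1} + (1/M) * sum_i (1 - 2 f(u_i)) x_i^T )
    M = X.shape[1]
    U = W @ X                      # u_i = W x_i for each sample
    phi = 1.0 - 2.0 * f(U)         # f''(u)/f'(u) for the logistic sigmoid
    dW = eta * (np.linalg.inv(W.T) + (phi @ X.T) / M)
    return W + dW

# One illustrative step on toy mixed data (mixing matrix is made up).
rng = np.random.default_rng(3)
S = rng.laplace(size=(2, 1000))    # independent super-Gaussian sources
A = np.array([[1.0, 0.5],
              [0.2, 1.0]])         # hypothetical mixing matrix
X = A @ S
W_next = infomax_step(np.eye(2), X)
print(W_next.shape)
```

Iterating this step drives W toward W_OPT, at which the output entropy h{f(WX)} is maximal.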
The ICA algorithm learns the weight matrix W, which is used to estimate a set of independent basis images in the rows of U. Projecting the eigenvectors onto the learned weight vectors produces the independent basis images.
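The whole procedure above can be sketched end-to-end. Everything here (shapes, learning rate, iteration count, random stand-in data) is an illustrative assumption, not the book's implementation; it only shows how steps 1-6 chain together.

```python
import numpy as np

def f(u):
    return 1.0 / (1.0 + np.exp(-u))   # logistic sigmoid (assumed source cdf)

rng = np.random.default_rng(5)
M, N = 4, 2000                         # 4 images, 2000 pixels (made-up sizes)

# Step 1: data matrix X, images in rows, pixels in columns.
X = rng.laplace(size=(M, N))

# Step 2: eigenvectors of the images; ICA is run on E^T (M by N).
_, _, Vt = np.linalg.svd(X - X.mean(axis=1, keepdims=True),
                         full_matrices=False)
Et = Vt                                # stand-in for E^T, rows to be analyzed

# Step 3: whitening (sphering) so the input covariance becomes the identity.
cov = Et @ Et.T / N
d, Ev = np.linalg.eigh(cov)
Wz = np.diag(d ** -0.5) @ Ev.T         # whitening matrix
Z = Wz @ Et

# Steps 4-6: infomax gradient ascent with logistic f, update as in (7.17).
W = np.eye(M)
eta = 0.1
for _ in range(200):
    U = W @ Z
    W = W + eta * (np.linalg.inv(W.T) + ((1.0 - 2.0 * f(U)) @ Z.T) / N)

# Rows of U are the (estimated) independent basis images of the whitened E^T;
# the final transformation is the product of W and the whitening matrix.
U = W @ Z
print(U.shape)
```

The final transformation matrix applied to E^T is W @ Wz, matching step 3's remark that it is the product of the whitening matrix and the optimal unmixing matrix.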