The single-layered neural network implements Y = f(WX); the gradient ∂h(Y)/∂W is used for the weight update according to gradient ascent on the joint entropy h(Y). The nonlinearity f in the neuron should roughly approximate the cdf of the source distribution. Therefore, the presented infomax algorithm in the real and complex domains proceeds by maximizing the entropy of the output of a single-layered neural network.
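As a quick numerical check of this principle, consider the following minimal sketch (it assumes standard logistic sources and uses NumPy; the variable names are illustrative). Passing a source through its own cdf yields a uniform output on (0, 1), i.e. the maximum-entropy distribution on that interval:

```python
import numpy as np

# If the nonlinearity f equals the cdf of the source, y = f(s) is uniform
# on (0, 1), the maximum-entropy distribution there. Logistic sources are
# an assumption made only for this sketch.
rng = np.random.default_rng(0)
s = rng.logistic(size=100_000)            # sources with a logistic cdf
y = 1.0 / (1.0 + np.exp(-s))              # f = logistic cdf of the source

hist, _ = np.histogram(y, bins=50, range=(0.0, 1.0), density=True)
p = hist / hist.sum()
entropy = -(p * np.log(p + 1e-12)).sum()  # close to log(50), the maximum
print(entropy, np.log(50))
```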
7.2.5 Feature Extraction with R ICA
Let X be the matrix of observed variables and U its transformation given by some matrix of coefficients W, U = WX. ICA may be stated as finding the matrix W so that the random variables u_i (the rows of U) are as independent as possible
(finding statistically independent basis images). The goal of the presented ICA algorithm is to maximize the mutual information between the environment X and the output of the neural network Y. This is achieved through gradient ascent on the entropy of the output with respect to the weight matrix W. The basic steps of the R ICA algorithm for feature extraction, derived from the principle of optimal information transfer in neurons with sigmoidal transfer functions, can be summarized as follows (a code sketch implementing these steps is given after the list):
1. Collect the images in the data matrix X (M by N) so that images are in rows and pixels are in columns.
2. Apply R PCA. The PCA basis vectors in E^T are analyzed for independent components, where E (N by M) is the matrix of M eigenvectors of the M images.
3. Apply whitening (sphering) of data. Final transformation matrix will be the prod-
uct of whitening matrix and optimal unmixing matrix.
4. Sources are modeled as real random vectors. Take the sigmoidal transfer function f as the joint cdf of the source signals, in view of optimal information transfer in a real-valued neural network.
5. Derive a contrast function (the joint entropy h) in view of the real-valued neural network. Perform maximization of the entropy h(Y) by the neural network using gradient ascent. If X is the input vector to the ANN, then f(WX) is the output.
6. Find the optimal matrix W such that MAX[h{f(WX)}] is attained. This can be done as follows: define a surface h{f(U)}, find the gradient ∇h with respect to W, and ascend it, ΔW ∝ ∇h; then

   ΔW = η [ (W^T)^{-1} + (1/M) Σ_{i=1}^{M} (f''(u_i)/f'(u_i)) · X^T ]        (7.17)

   where f''(u)/f'(u) = (1 − 2f(u)) for the logistic sigmoid.
When h is maximum, W is W_OPT.
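To make the procedure concrete, here is a minimal NumPy sketch assembling steps 1-6. It assumes the logistic sigmoid as f (so that f''(u)/f'(u) = 1 − 2f(u), as used in Eq. (7.17)), obtains the eigenvectors via SVD, and uses illustrative names (rica_basis_images, eta, n_iter); it is a sketch under these assumptions, not the authors' reference implementation:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def rica_basis_images(X, m, eta=0.005, n_iter=1000):
    """Steps 1-6 on a data matrix X (M images in rows, N pixels in columns).

    Returns the independent basis images (rows) and the final transformation
    matrix, i.e. the product of the unmixing and whitening matrices.
    """
    Xc = X - X.mean(axis=0)                       # step 1: centred data

    # Step 2: R PCA via SVD; the rows of Vt are the eigenvectors E^T that
    # are analysed for independent components.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    D = Vt[:m]                                    # E^T, shape (m, N)

    # Step 3: whitening (sphering) across the N pixel samples.
    Dc = D - D.mean(axis=1, keepdims=True)
    cov = Dc @ Dc.T / Dc.shape[1]
    vals, vecs = np.linalg.eigh(cov)
    Wz = vecs @ np.diag(1.0 / np.sqrt(vals + 1e-12)) @ vecs.T
    Z = Wz @ Dc

    # Steps 4-6: gradient ascent on h(f(WZ)) per Eq. (7.17); for the
    # logistic sigmoid, f''(u)/f'(u) = 1 - 2 f(u).
    W = np.eye(m)
    n = Z.shape[1]
    for _ in range(n_iter):
        U = W @ Z
        grad = np.linalg.inv(W.T) + ((1.0 - 2.0 * sigmoid(U)) @ Z.T) / n
        W += eta * grad                           # Delta W proportional to grad h

    W_total = W @ Wz                              # unmixing times whitening
    return W_total @ Dc, W_total
```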
The ICA algorithm learns the weight matrix W, which is used to estimate a set of independent basis images in the rows of U. Projecting the eigenvectors onto the learned weight vectors produces the independent basis images.
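For instance, the sketch above could be exercised as follows (a hypothetical usage on synthetic data, shown only to make the shapes explicit):

```python
import numpy as np

# 60 synthetic "images" of 32x32 pixels, flattened into the rows of X.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 32 * 32))
B, W_total = rica_basis_images(X, m=20)

# Each row of B is one independent basis image; reshape a row to view it.
basis_image_0 = B[0].reshape(32, 32)
print(B.shape, W_total.shape)   # (20, 1024) (20, 20)
```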