Infomax
Some of the first ICA algorithms, such as the Bell-Sejnowski algorithm,
were derived not from the maximum likelihood estimation principle as
shown above, but from the Infomax principle. It states that in an input-
output system, independence at the output is achieved by maximizing
the information flow, that is, the mutual information between inputs
and outputs. This makes sense only if some noise is introduced into the
system:
x = As + N
where N is an unknown white Gaussian random vector. One can show
that in the noiseless limit (|N| → 0) Infomax corresponds to maximizing
the output entropy.
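As a minimal numerical illustration of this mixing model (a sketch, assuming NumPy; the dimensions, the Laplacian source distribution, and the noise level are arbitrary choices, not from the text):

import numpy as np

rng = np.random.default_rng(0)
n, T = 2, 10000                     # number of sources and of samples (hypothetical)
s = rng.laplace(size=(n, T))        # independent non-Gaussian sources
A = rng.normal(size=(n, n))         # unknown mixing matrix A
N = 0.01 * rng.normal(size=(n, T))  # white Gaussian noise; |N| -> 0 is the noiseless limit
x = A @ s + N                       # observed mixtures, x = As + N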
Often input-output systems are modeled using neural networks. A
single-layer neural network output function reads as
y = Φ(Bx),
where Φ = ϕ1 × ··· × ϕn is a componentwise, monotonically increasing nonlin-
earity and B is the weight matrix. In this case, using theorem 3.4, the
entropy can be written as
H(y) = H(x) + E(log |det Φ'(Bx) B|),
where x is the input random vector. Then
H(y) = H(x) + ∑_{i=1}^n E(log ϕi'(bi x)) + log |det B|,

where bi denotes the i-th row of B.
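Before taking the expectation, this decomposition is a pointwise identity for the Jacobian of the map x ↦ Φ(Bx), namely log |det Φ'(Bx) B| = ∑ log ϕi'(bi x) + log |det B|. A quick numerical check (a sketch, assuming NumPy and the logistic function as every ϕi; both choices are illustrative, not from the text):

import numpy as np

rng = np.random.default_rng(1)
n = 3
B = rng.normal(size=(n, n))                  # weight matrix
x = rng.normal(size=n)                       # one input sample

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
u = B @ x
phi_prime = sigmoid(u) * (1.0 - sigmoid(u))  # ϕi'(bi x) for the logistic ϕi

J = np.diag(phi_prime) @ B                   # Jacobian of x -> Φ(Bx)
lhs = np.linalg.slogdet(J)[1]                # log |det J|
rhs = np.log(phi_prime).sum() + np.linalg.slogdet(B)[1]
print(np.isclose(lhs, rhs))                  # True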
Since H(x) is fixed, comparing the expression above with the log-likelihood func-
tion shows that Infomax directly corresponds to maximum likelihood, if
we assume that the componentwise nonlinearities are the cumulative
distribution functions of the source components (i.e. ϕi' = pi).
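Maximizing H(y) (equivalently, the likelihood) by gradient ascent in B yields an Infomax learning rule of the Bell-Sejnowski type. The following is a minimal sketch, not the original implementation: it assumes NumPy, the logistic function as ϕi (so the implied source densities pi = ϕi' are super-Gaussian), and Amari's natural-gradient form of the update:

import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def infomax_ica(x, lr=0.01, n_iter=2000):
    """Natural-gradient Infomax; x holds one mixture per row, shape (n, T)."""
    n, T = x.shape
    B = np.eye(n)                       # initial unmixing (weight) matrix
    for _ in range(n_iter):
        u = B @ x                       # current outputs, u = Bx
        psi = 1.0 - 2.0 * sigmoid(u)    # score function ϕi''/ϕi' of the logistic ϕi
        # ascend ∑ E(log ϕi'(bi x)) + log |det B| along the natural gradient
        B += lr * (np.eye(n) + (psi @ u.T) / T) @ B
    return B

# demo on synthetic mixtures (hypothetical data, as in the sketch above)
rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 10000))        # independent sources
x = rng.normal(size=(2, 2)) @ s         # noiseless mixtures
B = infomax_ica(x)
s_hat = B @ x                           # recovered up to permutation and scale

At a stationary point E(ψ(u)uᵀ) = −I, the familiar nonlinear decorrelation condition; the natural gradient (postmultiplication by BᵀB) avoids the matrix inversion B⁻ᵀ that the plain gradient would require.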
4.7 Time-Structure Based ICA
So far we have considered only mixtures of random variables having no
additional structure. In practice, this means that in each algorithm the
order of the samples was arbitrary. Of course, in reality the signals often
possess additional time structure.