The algorithms used in ICA can be deterministic or stochastic. Deterministic algorithms always produce the same results (usually by exploiting the algebraic structure of the matrices involved), whereas stochastic algorithms are adaptive: they start from a random unmixing matrix that is updated iteratively. The update can be performed after every observation (on-line) or over the whole set of observations (off-line). As a consequence, the results of stochastic algorithms vary across different executions. The reliability of the results therefore has to be studied, since the algorithm may reach a local optimum (local consistency) of the contrast function instead of its unique global optimum (global consistency). Convergence also depends on statistical factors such as the random sampling of the data. It is commonly accepted that the estimation results are robust to the details of the assumed source distributions (super- or sub-Gaussianity, and so on). However, it has also been demonstrated that incorrect assumptions about such distributions can result in poor estimation performance, and sometimes in a complete failure to obtain the source separation [28]. Local consistency of ICA methods that search for specified distributions, and global consistency in the case of two sources with heavy-tailed distributions, have been studied [19, 26, 29]. Recently, the statistical reliability or "quality" of the parameters estimated by ICA has been analyzed using bootstrap resampling techniques and visualization of the cluster structure of the components [30, 31].
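To make this run-to-run variability concrete, the following minimal sketch (an illustration only, assuming scikit-learn's FastICA as a representative stochastic implementation and a synthetic two-source mixture that is not taken from the text) runs the same algorithm twice from different random initializations; the recovered components typically agree only up to permutation, sign and scale, which is precisely why the reliability of the estimates has to be assessed.

```python
# Minimal sketch: run a stochastic ICA algorithm twice with different random
# initializations and compare the recovered components. The mixing matrix and
# source distributions below are illustrative assumptions, not from the text.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
S = np.c_[rng.laplace(size=2000),            # super-Gaussian source
          rng.uniform(-1, 1, size=2000)]     # sub-Gaussian source
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                   # assumed 2x2 mixing matrix
X = S @ A.T                                  # observed mixtures

est_1 = FastICA(n_components=2, random_state=1).fit_transform(X)
est_2 = FastICA(n_components=2, random_state=2).fit_transform(X)

# The two runs usually recover the same sources, but the components may come
# back permuted, sign-flipped and rescaled, so the correlation structure has
# to be inspected rather than assuming est_1 == est_2.
print(np.corrcoef(est_1.T, est_2.T).round(2))
```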
2.2 Standard ICA Methods
The ideal measure of independence is the "mutual information", which was proposed as a contrast function in [17]. It has been demonstrated that this function corresponds to the likelihood for a model of independent components that is optimized with respect to all its parameters. Thus, the likelihood in a given ICA model is the probability of a data set as a function of the mixing matrix and the component distributions [28]. Mutual information $I$ is defined as the Kullback-Leibler ($KL$) divergence, or relative entropy, between the joint density and the product of the marginal densities:
$$
I(s) = KL\!\left(s,\ \prod_i p(s_i)\right) = \int p(s)\,\log\!\left(\frac{p(s)}{\prod_i p(s_i)}\right) ds \qquad (2.5)
$$
It is non-negative and equals zero only if the two distributions are the same. The logarithm of the fraction in Eq. (2.5) can be expanded into a difference of logarithms; integrating term by term, and noting that each term involving $p(s_i)$ depends only on the corresponding marginal density, gives
$$
I(s) = \sum_i H(s_i) - H(s) \qquad (2.6)
$$
where $H(u)$ denotes Shannon's differential entropy of a continuous random variable $u$, which can be seen as a measure of the randomness of $u$.
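For a concrete check of Eq. (2.6), one case where all quantities have closed forms is the bivariate Gaussian: the differential entropy of a $d$-dimensional Gaussian is $\tfrac{1}{2}\log\!\big((2\pi e)^d \det\Sigma\big)$, and for two jointly Gaussian variables with correlation $\rho$ the mutual information reduces to $-\tfrac{1}{2}\log(1-\rho^2)$. The sketch below (the covariance values are assumptions chosen purely for illustration) verifies that the entropy difference of Eq. (2.6) matches this closed form and vanishes only when $\rho = 0$, i.e. when the two Gaussian variables are independent.

```python
# Numerical check of Eq. (2.6) for a bivariate Gaussian, where both the
# differential entropies and the mutual information have closed forms.
# The correlation value below is an illustrative assumption.
import numpy as np

rho = 0.6                                  # assumed correlation between the two variables
Sigma = np.array([[1.0, rho],
                  [rho, 1.0]])             # joint covariance matrix

def gaussian_entropy(cov):
    # Differential entropy of a d-dimensional Gaussian:
    # H = 0.5 * log((2*pi*e)^d * det(Sigma))
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** d * np.linalg.det(cov))

H_joint = gaussian_entropy(Sigma)                                    # H(s)
H_marginals = sum(gaussian_entropy(Sigma[i, i]) for i in range(2))   # sum_i H(s_i)

I_from_entropies = H_marginals - H_joint         # Eq. (2.6)
I_closed_form = -0.5 * np.log(1 - rho ** 2)      # known result for bivariate Gaussians

print(I_from_entropies, I_closed_form)           # both ~0.223; zero only when rho = 0
```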