ICA and ICAMM Methods - On Statistical Pattern Recognition in Independent Component Analysis Mixture Modelling


commonly assumed that both the observed variables and the independent components have zero mean.

The source independence is expressed through the joint probability density, which is the product of the marginal densities, $p(s) = \prod_i p_i(s_i)$. Since the source distribution is not available, independence is represented in other ways, e.g., using the following statistics:

$$E\left[ g_i(s_i)\, g_j(s_j) \right] = 0 \tag{2.3}$$

for $i \neq j$ and any non-linear functions $g_i$, $g_j$; i.e., all the cross-cumulants must be zero.
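As a rough numerical illustration of statistic (2.3), the sketch below compares a non-linear cross-moment for two independent sources with the same statistic for a rotated (uncorrelated but dependent) pair. The choice $g(v) = v^2$, the centring of $g$, the uniform sources, and the helper name `cross_moment` are all assumptions made for this example, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two independent, zero-mean, unit-variance uniform sources (assumed for illustration).
half_width = np.sqrt(3.0)
s1 = rng.uniform(-half_width, half_width, n)
s2 = rng.uniform(-half_width, half_width, n)

def cross_moment(a, b, g=np.square):
    """Sample estimate of E[g(a) g(b)] after centring g(a) and g(b) (hypothetical helper)."""
    ga = g(a) - g(a).mean()
    gb = g(b) - g(b).mean()
    return float(np.mean(ga * gb))

# A 45-degree rotation keeps the signals uncorrelated but makes them dependent.
x1 = (s1 + s2) / np.sqrt(2.0)
x2 = (s1 - s2) / np.sqrt(2.0)

linear_corr = float(np.mean(x1 * x2))   # second-order statistic: close to zero
indep_stat = cross_moment(s1, s2)       # close to zero for independent sources
mixed_stat = cross_moment(x1, x2)       # clearly non-zero for the rotated pair

print(linear_corr, indep_stat, mixed_stat)
```

The rotated pair passes the second-order (correlation) test yet fails the non-linear one, which is why decorrelation alone does not establish independence.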

Most of the existing algorithms used to estimate the matrix A can be organized into two categories. The first category of methods directly approximates the distributions of the hidden sources within a specified class of distributions and minimizes a cost function, the so-called contrast function (or simply contrast), which is generically denoted $\phi(s)$; examples are mutual information, the likelihood function, or equivalents [5, 17-21]. The design of an ICA algorithm includes the formulation of a contrast function that has to be minimized through an optimization procedure. The contrast function is a real-valued function of the estimated sources s, which attains its minimum value when independence is attained. The second category of methods optimizes other contrast functions without approximating distributions explicitly. These functions can be, for instance, nongaussianity (measured by negentropy or kurtosis) and nonlinear correlation among the estimated sources [2, 22].
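To make the kurtosis-based nongaussianity measure concrete, the following sketch estimates the excess kurtosis of a few sample distributions. The function name `excess_kurtosis` and the choice of test distributions are assumptions for illustration only; practical ICA algorithms typically use more robust estimators.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis of a 1-D signal: E[y^4] - 3 after standardization."""
    y = (y - y.mean()) / y.std()
    return float(np.mean(y ** 4) - 3.0)

rng = np.random.default_rng(1)
n = 200_000

gauss = rng.standard_normal(n)        # Gaussian: excess kurtosis near 0
laplace = rng.laplace(size=n)         # super-Gaussian: positive excess kurtosis
uniform = rng.uniform(-1.0, 1.0, n)   # sub-Gaussian: negative excess kurtosis

# An equal-weight mixture of two independent uniform signals lies closer to Gaussian.
mixture = (uniform + rng.uniform(-1.0, 1.0, n)) / np.sqrt(2.0)

k_gauss = excess_kurtosis(gauss)
k_laplace = excess_kurtosis(laplace)
k_uniform = excess_kurtosis(uniform)
k_mixture = excess_kurtosis(mixture)

print(k_gauss, k_laplace, k_uniform, k_mixture)
```

In this sketch the mixture has a smaller absolute kurtosis than either of its uniform components, which is the intuition behind using nongaussianity as a contrast.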

In several ICA algorithms, the data are first whitened (also called sphered), which requires the covariance matrix of the whitened data to be the identity matrix. It is well known that the demixing matrix can be factorized as the product of an orthogonal matrix and a whitening matrix, i.e., $B = WV$, where $V$ is the whitening matrix and $W$ is the orthogonal one. The mixtures are first whitened in order to exhaust the second-order moments (the signals are forced to be uncorrelated). The whitened vector is expressed as $z = Vx = VAs$, with $E[zz^T] = I$, and the whiteness constraint is $E[ss^T] = I$, with $s$ being the estimated sources. Thus, the ICA model, considering a prewhitening step, is expressed as

$$s = Bx = WVx \tag{2.4}$$
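The whitening step can be sketched as follows. This is a minimal example that assumes a whitening matrix built from the eigendecomposition of the sample covariance (one common choice among several); the mixing matrix `A` and the uniform sources are hypothetical values chosen only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Hypothetical unit-variance sources and mixing matrix, used only for this demonstration.
s = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, n))
A = np.array([[2.0, 1.0],
              [1.0, 1.5]])
x = A @ s
x = x - x.mean(axis=1, keepdims=True)   # enforce zero mean

# Whitening matrix V = D^{-1/2} E^T from the eigendecomposition cov(x) = E D E^T.
eigvals, eigvecs = np.linalg.eigh(np.cov(x))
V = np.diag(eigvals ** -0.5) @ eigvecs.T

z = V @ x                               # whitened vector z = V x = V A s
cov_z = np.cov(z)                       # equals the identity up to rounding

# With unit-variance independent sources, V A is approximately orthogonal,
# so only an orthogonal factor W remains to be estimated.
VA = V @ A
orthogonality_error = float(np.max(np.abs(VA @ VA.T - np.eye(2))))

print(cov_z)
print(orthogonality_error)
```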

The orthogonal matrix W is a rotation of the joint density, which has to maximize the nongaussianity of the marginal densities, thus maximizing a measure of independence. The rotation step keeps the covariance of s equal to the identity, thus preserving the whiteness and, hence, the decorrelation of the components. Prewhitening is an optional step in estimating the ICA parameters; in fact, recent methods avoid a prewhitening phase and directly attempt to compute a non-orthogonal diagonalizing congruence (see, e.g., [23, 24]). A discussion of the connections between mutual information, entropy, and non-Gaussianity in a general framework without imposing whitening is presented in [25]. However, prewhitening in ICA algorithms has been reported to provide computational advantages (see, e.g., [26, 27]).
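The rotation step can be illustrated in two dimensions with a brute-force search over the rotation angle. This is only a sketch under stated assumptions: the contrast is the sum of absolute excess kurtoses, the sources are uniform, the mixing matrix `A` is hypothetical, and a grid search stands in for the optimization procedures used by actual ICA algorithms.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

# Hypothetical sources and mixing matrix, used only for this demonstration.
s = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, n))
A = np.array([[2.0, 1.0],
              [1.0, 1.5]])
x = A @ s
x = x - x.mean(axis=1, keepdims=True)

# Prewhitening: z = V x has identity covariance.
eigvals, eigvecs = np.linalg.eigh(np.cov(x))
V = np.diag(eigvals ** -0.5) @ eigvecs.T
z = V @ x

def rotation(theta):
    """2-D rotation matrix (an orthogonal W)."""
    c, sn = np.cos(theta), np.sin(theta)
    return np.array([[c, -sn],
                     [sn, c]])

def contrast(y):
    """Sum of absolute excess kurtoses; the rows of y are assumed to have unit variance."""
    return float(np.sum(np.abs(np.mean(y ** 4, axis=1) - 3.0)))

# Grid search over the angle; [0, pi/2] suffices up to permutation and sign.
thetas = np.linspace(0.0, np.pi / 2, 181)
scores = [contrast(rotation(t) @ z) for t in thetas]
W = rotation(thetas[int(np.argmax(scores))])

B = W @ V            # demixing matrix, B = W V
s_hat = B @ x        # estimated sources, s = B x = W V x

cov_s_hat = np.cov(s_hat)                             # whiteness is preserved by W
corr = np.corrcoef(np.vstack([s, s_hat]))[:2, 2:]     # true vs. estimated sources

print(np.round(cov_s_hat, 6))
print(np.round(np.abs(corr), 3))
```

Each true source should correlate strongly (in absolute value) with exactly one estimated component, reflecting the usual permutation and sign ambiguities of ICA.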
