non-gaussian structures recovered by a learning algorithm using Beta divergence [56]. In addition, the automatic estimation of the number of ICA mixtures has been approached by variational Bayesian learning [60, 61] and by on-line adaptive estimation of the clusters comparing the log-likelihood of the data [62]. An alternative to the simultaneous estimation of all the ICAMM parameters is to perform segmented and repeated ICAs. This strategy has recently been applied to the extraction of neural activity from large-scale optical recordings [63]. Finally, a computational optimization of the gradient techniques used in ICAMM algorithms, applying Newton's method, was proposed in [60, 64].
The general formulation of ICAMM is:

$$\mathbf{x}_t = \mathbf{A}_k \mathbf{s}_k + \mathbf{b}_k, \qquad k = 1, \ldots, K \qquad (2.35)$$
where $C_k$ denotes the class $k$, and each class is described by an ICA model with a mixing matrix $\mathbf{A}_k$ and a bias vector $\mathbf{b}_k$. Essentially, $\mathbf{b}_k$ determines the location of the cluster and $\mathbf{A}_k \mathbf{s}_k$ its shape. The goal of an ICA mixture model algorithm is to determine the parameters for each class. Figure 2.3 shows the model of ICA mixtures.
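As a concrete illustration of Eq. (2.35), the following minimal sketch draws synthetic data from an ICA mixture model. The two-class, two-dimensional setup, the equal class priors, and the Laplacian source prior are illustrative assumptions, not part of the original text.

import numpy as np

rng = np.random.default_rng(0)

K, dim, T = 2, 2, 500                                     # classes, dimensions, samples
A = [rng.normal(size=(dim, dim)) for _ in range(K)]       # mixing matrices A_k (assumed random)
b = [rng.normal(scale=3.0, size=dim) for _ in range(K)]   # bias vectors b_k (cluster locations)
priors = np.full(K, 1.0 / K)                              # class priors p(C_k), assumed uniform

X = np.empty((T, dim))
labels = rng.choice(K, size=T, p=priors)                  # draw a class for each observation
for t, k in enumerate(labels):
    s = rng.laplace(size=dim)                             # independent non-gaussian sources s_k
    X[t] = A[k] @ s + b[k]                                # Eq. (2.35): x_t = A_k s_k + b_k

Each cluster in X is located at $\mathbf{b}_k$ and shaped by $\mathbf{A}_k$, as described above.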
Several methods have been proposed within the ICAMM framework. They can be grouped as follows: maximum-likelihood-based methods, iterative methods based on a distance measure, and variational Bayesian learning methods. We include a review of three representative ICAMM techniques: the first proposed method for unsupervised classification and automatic context switching [58], the Beta-divergence method [65], and a variational Bayesian method [53].
2.4.1 Unsupervised Classification Using ICAMM
In [58], a maximum-likelihood-based algorithm for unsupervised classification that models classes with non-gaussian densities (ICA structures) is proposed.
The likelihood of the data is given by the joint density

$$p(\mathbf{X} \mid \Theta) = \prod_{t=1}^{T} p(\mathbf{x}_t \mid \Theta),$$

with $t$ being the data index, $t = 1, \ldots, T$. The mixture density is

$$p(\mathbf{x}_t \mid \Theta) = \sum_{k=1}^{K} p(\mathbf{x}_t \mid C_k, \theta_k)\, p(C_k),$$

where $\Theta = (\theta_1, \ldots, \theta_K)$ are the unknown parameters for each of the component densities $p(\mathbf{x} \mid C_k, \theta_k)$, and $C_k$ denotes the class $k$, $k = 1, \ldots, K$. The data within each class $k$ are described by Eq. (2.35).
The log-likelihood of the data for each class is defined as

$$\log p(\mathbf{x}_t \mid C_k, \theta_k) = \log p(\mathbf{s}_k) - \log |\det \mathbf{A}_k| \qquad (2.36)$$
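A minimal sketch of Eq. (2.36) follows, with the sources recovered as $\mathbf{s}_k = \mathbf{A}_k^{-1}(\mathbf{x}_t - \mathbf{b}_k)$ before evaluating the source density. The unit-scale Laplacian source prior is an illustrative assumption; the algorithm in [58] may use a different $p(\mathbf{s})$.

import numpy as np

def class_loglik(x, A_k, b_k):
    """Log-likelihood of observation x under class k's ICA model, Eq. (2.36)."""
    s = np.linalg.solve(A_k, x - b_k)            # recover sources: s_k = A_k^{-1}(x - b_k)
    log_p_s = np.sum(-np.abs(s) - np.log(2.0))   # log p(s): i.i.d. Laplace(0, 1) assumption
    _, logabsdet = np.linalg.slogdet(A_k)        # numerically stable log|det A_k|
    return log_p_s - logabsdet                   # log p(s_k) - log|det A_k|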
The probability for each class given the data vector $\mathbf{x}_t$ is:

$$p(C_k \mid \mathbf{x}_t, \Theta) = \frac{p(\mathbf{x}_t \mid \theta_k, C_k)\, p(C_k)}{\sum_{k'=1}^{K} p(\mathbf{x}_t \mid \theta_{k'}, C_{k'})\, p(C_{k'})}.$$
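The following sketch computes these class posteriors; its normalizer is exactly the mixture density defined earlier, and the log-sum-exp step is a standard numerical safeguard rather than part of the original formulation. It reuses class_loglik, A, b, and priors from the sketches above.

import numpy as np

def class_posteriors(x, A, b, priors):
    """Posterior p(C_k | x_t, Theta) for every class k."""
    log_post = np.array([class_loglik(x, A_k, b_k) + np.log(p_k)
                         for A_k, b_k, p_k in zip(A, b, priors)])
    log_post -= log_post.max()                   # log-sum-exp shift against underflow
    post = np.exp(log_post)
    return post / post.sum()                     # normalize over the K classes

Assigning each $\mathbf{x}_t$ to the class with the largest posterior yields the unsupervised classification described in [58].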