ICA and ICAMM Methods - On Statistical Pattern Recognition in Independent Component Analysis Mixture Modelling


depending on the degree to which the new incoming pattern $x_t$ belongs to the $j$th cluster, which is defined as $F_j(x_t) = \log p(x_t \mid C_j)$. The maximum log-likelihood value $F_{J_{\max}}(x_t)$ among all log-likelihood values estimated for the existing $J$ clusters at time $t$ is selected. If $F_{J_{\max}}(x_t) \ge F$, the corresponding new incoming pattern is added to the existing cluster with index $J_{\max}$, and the parameters of this cluster are updated accordingly ($F$ is a given negative threshold value obtained empirically). In this case, no new cluster is generated. If $F_{J_{\max}}(x_t) < F$, a new cluster is generated to accommodate this new pattern.
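The threshold rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `assign_pattern` and the numeric values are hypothetical, and the log-likelihoods $F_j(x_t)$ are assumed to have been computed elsewhere.

```python
def assign_pattern(log_likelihoods, threshold):
    """Apply the threshold rule to one incoming pattern.

    log_likelihoods: list holding F_j(x_t) for each of the J existing clusters.
    threshold: the negative, empirically chosen threshold F.
    Returns the (zero-based) index J_max of the cluster that receives the
    pattern, or None when a new cluster has to be created.
    """
    if not log_likelihoods:
        # No cluster exists yet, so the pattern must start a new one.
        return None
    j_max = max(range(len(log_likelihoods)), key=lambda j: log_likelihoods[j])
    if log_likelihoods[j_max] >= threshold:
        return j_max  # the existing cluster J_max absorbs the pattern
    return None       # even the best fit is too poor: open a new cluster


# Illustrative values only.
winner = assign_pattern([-5.0, -1.2, -3.3], threshold=-2.0)
newcomer = assign_pattern([-5.0, -4.2], threshold=-2.0)
print(winner, newcomer)
```

With these illustrative values, the first pattern joins the cluster at zero-based index 1, while the second falls below the threshold for every cluster and therefore triggers a new one. Updating the parameters of the winning cluster is left out of the sketch.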

2.4.2 β-Divergence Method Applied to ICAMM

This algorithm is based on the minimum β-divergence distance [56, 65]. The β-divergence between two probability density functions $p(x)$ and $q(x)$ is defined as

$$
D_\beta(p, q) = \int \left[ p(x)\,\frac{1}{\beta}\left\{ p^{\beta}(x) - q^{\beta}(x) \right\} - \frac{1}{\beta + 1}\left\{ p^{\beta + 1}(x) - q^{\beta + 1}(x) \right\} \right] dx, \quad \text{for } \beta > 0, \tag{2.39}
$$

which is non-negative and equal to zero if and only if $p(x) = q(x)$. The β-divergence reduces to the Kullback-Leibler divergence when $\beta \to 0$.
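A quick numerical check of these properties is sketched below. It approximates the integral in (2.39) by a Riemann sum on a grid; the two Gaussian densities, the grid limits and the helper names are assumptions chosen only for illustration.

```python
import numpy as np


def beta_divergence(p, q, dx, beta):
    """Riemann-sum approximation of D_beta(p, q) from Eq. (2.39), beta > 0."""
    integrand = (p * (p**beta - q**beta) / beta
                 - (p**(beta + 1) - q**(beta + 1)) / (beta + 1))
    return float(np.sum(integrand) * dx)


def kl_divergence(p, q, dx):
    """Riemann-sum approximation of the Kullback-Leibler divergence KL(p || q)."""
    return float(np.sum(p * np.log(p / q)) * dx)


def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


# Illustrative densities on a fine grid (assumed, not taken from the text).
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
p = gaussian_pdf(x, 0.0, 1.0)
q = gaussian_pdf(x, 1.0, 1.5)

print("D_beta(p, p), beta=0.5 :", beta_divergence(p, p, dx, 0.5))
print("D_beta(p, q), beta=0.5 :", beta_divergence(p, q, dx, 0.5))
print("D_beta(p, q), beta=1e-4:", beta_divergence(p, q, dx, 1e-4))
print("KL(p || q)             :", kl_divergence(p, q, dx))
```

The first value should vanish, the second should be strictly positive, and the last two should nearly coincide, in line with the limit $\beta \to 0$.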

There exist a matrix $W$ and a shifting parameter vector $\mu$ such that the components of $s = Wx - \mu$ are independent. Thus, the joint density of $s$ can be expressed as the product of marginal density functions $q_1, \ldots, q_m$ by $q(s) = \prod_{i=1}^{m} q_i(s_i)$, and the joint density function of $x$ can be expressed as

$$
r(x; W, \mu) = |\det(W)| \prod_{i=1}^{m} q_i(w_i x - \mu_i),
$$

where $w_i$ is the $i$th row vector of $W$, and $\mu_i$ is the $i$th component of $\mu$.
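As a sketch of how this density can be evaluated, the snippet below computes $\log r(x; W, \mu)$ for a single observation. The text does not fix the marginals $q_i$, so a unit-scale Laplace density is assumed here purely for illustration; the function names are likewise hypothetical.

```python
import numpy as np


def laplace_logpdf(s):
    # Assumed marginal (not from the text): q_i(s) = 0.5 * exp(-|s|).
    return -np.abs(s) - np.log(2.0)


def ica_log_density(x, W, mu, marginal_logpdf=laplace_logpdf):
    """log r(x; W, mu) = log|det W| + sum_i log q_i(w_i x - mu_i)."""
    s = W @ x - mu                          # recovered sources s = W x - mu
    _, logabsdet = np.linalg.slogdet(W)     # numerically stable log|det W|
    return float(logabsdet + np.sum(marginal_logpdf(s)))


# Illustrative 2-D example.
W = np.array([[2.0, 0.0],
              [0.0, 2.0]])
mu = np.zeros(2)
x = np.zeros(2)
print(ica_log_density(x, W, mu))
```

Working in the log domain avoids underflow when $m$ is large, and `numpy.linalg.slogdet` handles the Jacobian term $|\det(W)|$ without computing the determinant directly.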

The algorithm explores the recovering matrix of each class in the ICA mixture on the basis of the initial condition of a shifting parameter $\mu$. If the initial value of the shifting parameter is close to the mean of the $k$th class, then the estimates for the recovering matrix $W_k$ and the shifting parameter $\mu_k$ can be obtained for this class by considering the data in other classes as outliers. Thus, $\{(W_k, \mu_k);\ k = 1, \ldots\}$ can be estimated by the repeated application of the β-divergence method to recover all hidden classes sequentially, based on a rule for the step-by-step change of the shifting parameter $\mu$. In order to create a rule for the sequential change of $\mu$, the weight function $\varphi$ is defined as

$$
\varphi(x; W, \mu) = \prod_{i=1}^{m} p_i(w_i x - \mu_i). \tag{2.40}
$$
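The role of the weight function in treating other classes as outliers can be illustrated with a small sketch. It follows (2.40) as printed; the choice of standard Gaussian marginals $p_i$, the identity recovering matrix and the sample points are all assumptions made for illustration only.

```python
import numpy as np


def gaussian_pdf(s):
    # Assumed marginal p_i (not from the text): standard normal density.
    return np.exp(-0.5 * s**2) / np.sqrt(2.0 * np.pi)


def weight(x, W, mu, marginal_pdf=gaussian_pdf):
    """phi(x; W, mu) = prod_i p_i(w_i x - mu_i), following Eq. (2.40) as printed."""
    return float(np.prod(marginal_pdf(W @ x - mu)))


W = np.eye(2)                  # hypothetical recovering matrix
mu = np.array([0.0, 0.0])      # shifting parameter close to one class

near = weight(np.array([0.2, -0.1]), W, mu)   # point from the class near mu
far = weight(np.array([8.0, 8.0]), W, mu)     # point from a distant class
print(near, far)
```

A point from a distant class receives a weight many orders of magnitude smaller than a point near the current shifting parameter, which is the sense in which data from other classes behave as outliers.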

The minimum β-divergence method finds the minimizer of the empirical β-divergence $D_\beta(r, r_0(\cdot\,; W, \mu))$, where $r$ is the empirical distribution of $x$, and $r_0$
