$\mathbf{s}_v^{(k)} = \mathbf{B}\big(\mathbf{s}_u^{(k)} + \mathbf{c}\big)$, with $\mathbf{B} = \mathbf{A}_v^{-1}\mathbf{A}_u$, $\mathbf{c} = \mathbf{A}_u^{-1}(\mathbf{b}_u - \mathbf{b}_v)$, $\mathbf{s}^{(k)} = \left[s_1(k), \ldots, s_M(k)\right]^{T}$, $k \in [1, Q]$. Of course, the other term in Eq. (4.11) is obtained in a similar way, considering that now the expectations are obtained by averaging the pdf of the sources of the other cluster.
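For reference, this change of source coordinates follows directly from equating the two cluster models (assuming the usual ICA mixture notation $\mathbf{x} = \mathbf{A}_k\mathbf{s}_k + \mathbf{b}_k$ for cluster $k$):

$$
\mathbf{x} = \mathbf{A}_u\mathbf{s}_u + \mathbf{b}_u = \mathbf{A}_v\mathbf{s}_v + \mathbf{b}_v
\;\;\Rightarrow\;\;
\mathbf{s}_v = \mathbf{A}_v^{-1}\big(\mathbf{A}_u\mathbf{s}_u + \mathbf{b}_u - \mathbf{b}_v\big)
= \underbrace{\mathbf{A}_v^{-1}\mathbf{A}_u}_{\mathbf{B}}\Big(\mathbf{s}_u + \underbrace{\mathbf{A}_u^{-1}(\mathbf{b}_u - \mathbf{b}_v)}_{\mathbf{c}}\Big)
$$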
Taking into account all the terms in Eq. (4.4), the symmetrical Kullback-Leibler distance between clusters u, v can be computed numerically from the samples following the corresponding distributions, $\{s_u^i(1), s_u^i(2), \ldots, s_u^i(Q)\}$, $\{s_v^j(1), s_v^j(2), \ldots, s_v^j(Q)\}$, $i = 1, \ldots, M$; $j = 1, \ldots, M$ (we assume that the number of samples per source is the same for all of them). The computation is summarized as follows:
$$
\begin{aligned}
D_{KL}\big(p_{x_u}(\mathbf{x}) \,\|\, p_{x_v}(\mathbf{x})\big) ={}& -\sum_{i=1}^{M} \hat{H}\big(S_u^i\big) - \sum_{j=1}^{M} \hat{H}\big(S_v^j\big) + \sum_{i=1}^{M} H\big(S_v, S_u^i\big) + \sum_{j=1}^{M} H\big(S_u, S_v^j\big) \\[4pt]
\hat{H}\big(S_u^i\big) ={}& -\frac{1}{Q}\sum_{n=1}^{Q} \log p_{s_u^i}\big(s_u^i(n)\big), \qquad
p_{s_u^i}\big(s_u^i(n)\big) = \frac{1}{N}\sum_{l=1}^{N} \alpha\, e^{-\frac{\left[s_u^i(n)-s_u^i(l)\right]^2}{2h^2}} \\[4pt]
\hat{H}\big(S_v^j\big) ={}& -\frac{1}{Q}\sum_{n=1}^{Q} \log p_{s_v^j}\big(s_v^j(n)\big), \qquad
p_{s_v^j}\big(s_v^j(n)\big) = \frac{1}{N}\sum_{l=1}^{N} \alpha\, e^{-\frac{\left[s_v^j(n)-s_v^j(l)\right]^2}{2h^2}} \\[4pt]
H\big(S_v, S_u^i\big) ={}& -\frac{1}{Q^M}\sum_{s_{v1}=1}^{Q}\cdots\sum_{s_{vM}=1}^{Q} \log\!\left[\frac{1}{N}\sum_{n=1}^{N} \alpha\, e^{-\frac{\left(\left[\mathbf{A}_u^{-1}(\mathbf{A}_v \mathbf{s}_v + \mathbf{b}_v - \mathbf{b}_u)\right]_i - s_u^i(n)\right)^2}{2h^2}}\right] \\[4pt]
H\big(S_u, S_v^j\big) ={}& -\frac{1}{Q^M}\sum_{s_{u1}=1}^{Q}\cdots\sum_{s_{uM}=1}^{Q} \log\!\left[\frac{1}{N}\sum_{n=1}^{N} \alpha\, e^{-\frac{\left(\left[\mathbf{A}_v^{-1}(\mathbf{A}_u \mathbf{s}_u + \mathbf{b}_u - \mathbf{b}_v)\right]_j - s_v^j(n)\right)^2}{2h^2}}\right]
\end{aligned}
\tag{4.15}
$$
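To make the procedure concrete, the following is a minimal numerical sketch of Eq. (4.15) under a few stated assumptions: the one-dimensional kernels are Gaussian with bandwidth h (so the constant α is taken as the kernel normalization 1/(√(2π) h)), the same Q samples per source serve both as evaluation points and as kernel centres (i.e. N = Q), and Q and M are small enough that the Q^M joint-sample sums can be enumerated directly. All function and variable names are illustrative, not taken from the text.

```python
import itertools
import numpy as np

def kernel_log_pdf(points, samples, h):
    """log of a 1-D Gaussian-kernel density estimate built on `samples`,
    evaluated at `points`."""
    diffs = points[:, None] - samples[None, :]            # (P, N)
    k = np.exp(-diffs**2 / (2.0 * h**2)) / (np.sqrt(2.0 * np.pi) * h)
    return np.log(k.mean(axis=1) + 1e-300)

def marginal_entropy(s, h):
    """H_hat(S^i): resubstitution entropy estimate of one source from its samples."""
    return -kernel_log_pdf(s, s, h).mean()

def cross_entropy(A_from, b_from, S_from, A_to, b_to, S_to, h):
    """Sum over i of H(S_from, S_to^i): joint samples of the 'from' cluster are
    mapped into the source space of the 'to' cluster and scored under its
    kernel-estimated marginals."""
    M, Q = S_from.shape
    A_to_inv = np.linalg.inv(A_to)
    total = 0.0
    for idx in itertools.product(range(Q), repeat=M):     # Q**M joint samples
        s = S_from[np.arange(M), np.array(idx)]           # one joint source vector
        x = A_from @ s + b_from                           # back to observation space
        s_mapped = A_to_inv @ (x - b_to)                  # into the other source space
        for i in range(M):
            total -= kernel_log_pdf(s_mapped[i:i + 1], S_to[i], h)[0]
    return total / Q**M

def symmetric_kl(A_u, b_u, S_u, A_v, b_v, S_v, h=0.5):
    """Sample-based symmetric KL distance between two ICA clusters, Eq. (4.15).
    S_u, S_v hold the source samples, one row per source: shape (M, Q)."""
    ent_u = sum(marginal_entropy(S_u[i], h) for i in range(S_u.shape[0]))
    ent_v = sum(marginal_entropy(S_v[j], h) for j in range(S_v.shape[0]))
    return (cross_entropy(A_v, b_v, S_v, A_u, b_u, S_u, h)    # sum_i H(S_v, S_u^i)
            + cross_entropy(A_u, b_u, S_u, A_v, b_v, S_v, h)  # sum_j H(S_u, S_v^j)
            - ent_u - ent_v)
```

For larger Q or M, the exhaustive Q^M enumeration in cross_entropy would typically be replaced by Monte Carlo draws of the joint source vectors.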
As can be observed, the similarity between clusters depends not only on the similarity between the bias terms, but also on the similarity between the source distributions and the mixing matrices. The computations can also be easily extended to the case where the number of sources is not the same in every class. If the distributions are approximated by just a single Gaussian (keeping in mind that the ICA problem then reduces to the PCA problem, since there is an indeterminacy defined by an orthogonal matrix that is not identifiable) and the distance is obtained analytically for the first level of the hierarchy, the distance between two multivariate normal distributions of dimension $M$, $p_u(\mathbf{x}) = N(\boldsymbol{\mu}_u, \mathbf{R}_u)$ and $p_v(\mathbf{x}) = N(\boldsymbol{\mu}_v, \mathbf{R}_v)$, would be
$$
D_{KL}\big(p_u(\mathbf{x}) \,\|\, p_v(\mathbf{x})\big) = \frac{1}{2}\Big[\operatorname{tr}\!\big(\mathbf{R}_u\mathbf{R}_v^{-1}\big) + \operatorname{tr}\!\big(\mathbf{R}_v\mathbf{R}_u^{-1}\big) + \operatorname{tr}\!\big(\big(\mathbf{R}_u^{-1}+\mathbf{R}_v^{-1}\big)(\boldsymbol{\mu}_u-\boldsymbol{\mu}_v)(\boldsymbol{\mu}_u-\boldsymbol{\mu}_v)^T\big) - 2M\Big]
\tag{4.16}
$$

where $\operatorname{tr}[\mathbf{A}]$ is the trace of the matrix $\mathbf{A}$.
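As a quick check of Eq. (4.16), a few lines of Python (in the same illustrative style as the sketch above; the function name is an assumption, not from the text) compute this closed-form distance:

```python
import numpy as np

def gaussian_symmetric_kl(mu_u, R_u, mu_v, R_v):
    """Symmetric KL distance between N(mu_u, R_u) and N(mu_v, R_v), Eq. (4.16)."""
    M = mu_u.size
    Ru_inv, Rv_inv = np.linalg.inv(R_u), np.linalg.inv(R_v)
    d = (mu_u - mu_v).reshape(-1, 1)                  # column vector of mean differences
    return 0.5 * (np.trace(R_u @ Rv_inv) + np.trace(R_v @ Ru_inv)
                  + np.trace((Ru_inv + Rv_inv) @ (d @ d.T)) - 2 * M)

# Example: identical Gaussians give distance 0.
mu, R = np.zeros(2), np.eye(2)
assert abs(gaussian_symmetric_kl(mu, R, mu, R)) < 1e-12
```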
Once the distances are obtained for all the clusters, the two clusters with minimum distance are merged at a certain level. This is repeated at each step of the hierarchy until a single cluster is reached at level h = k. To merge clusters at level h, the distances can be calculated from the distances of level h - 1 (the merge loop itself is sketched below). Suppose that