3.1 The Model and the Definition of the Problem
In ICA mixture modelling, it is assumed that the feature (observation) vectors $x_k$ corresponding to a given class $C_k$ $(k = 1 \ldots K)$ are the result of applying a linear transformation $A_k$ to a (source) vector $s_k$, whose elements are independent random variables, plus a bias vector $b_k$, i.e.,

$$x_k = A_k s_k + b_k, \qquad k = 1 \ldots K \qquad (3.1)$$
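As a concrete illustration of Eq. (3.1), here is a minimal NumPy sketch that draws synthetic observations from the model; the mixing matrices, bias vectors and the Laplacian source distribution are illustrative assumptions, not prescribed by the chapter:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_class(A_k, b_k, n):
    """Draw n observations x = A_k s_k + b_k for one class, Eq. (3.1)."""
    dim = A_k.shape[0]                   # A_k is square: dim(x) == dim(s)
    S = rng.laplace(size=(dim, n))       # independent source components
    return A_k @ S + b_k[:, None]        # mix linearly and add the bias

# Hypothetical two-class, two-dimensional example
A = [np.array([[1.0, 0.5], [0.2, 1.0]]), np.array([[0.8, -0.3], [0.4, 0.9]])]
b = [np.zeros(2), np.array([3.0, 3.0])]
X = [sample_class(A[k], b[k], 500) for k in range(2)]
```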
We assume that $A_k$ is a square matrix: feature and source vectors have the same dimension. This is a practical assumption, since original feature vectors are normally subjected to PCA and only the main (uncorrelated) components are retained for ICA, with no further reduction in the dimension of the new feature vector that is obtained in this way. An optimum classification of a given feature vector $x$ of unknown class is made by selecting the class $C_k$ that has the maximum conditional probability $p(C_k/x)$. Considering Bayes' theorem, we can write:
$$p(C_k/x) = \frac{p(x/C_k)\, p(C_k)}{p(x)} = \frac{p(x/C_k)\, p(C_k)}{\sum_{k'=1}^{K} p(x/C_{k'})\, p(C_{k'})} \qquad (3.2)$$
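In code, Eq. (3.2) reduces to normalizing the products of class-conditional likelihoods and priors. A minimal NumPy sketch (the numeric values in the example are placeholders):

```python
import numpy as np

def posteriors(likelihoods, priors):
    """Bayes' theorem, Eq. (3.2): posterior p(C_k/x) for every class."""
    joint = np.asarray(likelihoods) * np.asarray(priors)   # p(x/C_k) p(C_k)
    return joint / joint.sum()                             # normalize over all K classes

# Example: posteriors([0.02, 0.05], [0.5, 0.5]) -> array([0.2857..., 0.7143...])
```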
On the other hand, noting Eq. (3.1), if $x$ is a feature vector corresponding to class $C_k$, then [4]

$$p(x/C_k) = |\det A_k^{-1}|\, p(s_k) \qquad (3.3)$$

where $s_k = A_k^{-1}(x - b_k)$. Considering Eqs. (3.2) and (3.3), we can write

$$p(C_k/x) = \frac{|\det A_k^{-1}|\, p(s_k)\, p(C_k)}{\sum_{k'=1}^{K} |\det A_{k'}^{-1}|\, p(s_{k'})\, p(C_{k'})} \qquad (3.4)$$
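Eqs. (3.3) and (3.4) turn classification into a per-class score. A sketch of the numerator of Eq. (3.4), assuming the joint source pdf is available as a callable `source_pdf` (a hypothetical name; its estimation is the subject of what follows):

```python
import numpy as np

def class_score(x, A_k, b_k, source_pdf, prior_k):
    """Numerator of Eq. (3.4): |det A_k^{-1}| p(s_k) p(C_k)."""
    A_inv = np.linalg.inv(A_k)
    s_k = A_inv @ (x - b_k)                 # recover the source vector, Eq. (3.3)
    return abs(np.linalg.det(A_inv)) * source_pdf(s_k) * prior_k
```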
In conclusion, given a feature vector $x$, we should compute the corresponding source vectors $s_k$, $k = 1 \ldots K$, from $s_k = A_k^{-1}(x - b_k)$, and finally select the class having the maximum computed value of $|\det A_k^{-1}|\, p(s_k)\, p(C_k)$. (Note that the denominator in Eq. (3.4) does not depend on $k$, so it does not influence the maximization of $p(C_k/x)$.) To make the above computation, we need to estimate $A_k$ and $b_k$ (to compute $s_k$ from $x$) and the multidimensional pdf of the source vectors for every class (to compute $p(s_k)$). Two assumptions are considered to solve this problem. The first is that the elements of $s_k$ are independent random variables (the ICA assumption), so that the multidimensional pdf can be factored into the corresponding marginal pdfs of the vector elements. The second is that a set of independent feature vectors (learning vectors) is available, represented by the matrix $X = [x(1) \ldots x(N)]$. We consider a hybrid situation where the classes of a few learning vectors are known (supervised learning), while the classes of the others are unknown.
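Under the first assumption, $p(s_k)$ is a product of one-dimensional marginal pdfs, which makes the classification rule straightforward to implement once $A_k$, $b_k$ and the marginals have been estimated from the learning set. A minimal sketch assuming those estimates are already available (all names and the Laplacian marginals in the usage note are illustrative):

```python
import numpy as np

def classify(x, A, b, marginal_pdfs, priors):
    """Select argmax_k of the numerator of Eq. (3.4).

    marginal_pdfs[k] holds one 1-D pdf per source component; under the
    ICA assumption their product is the joint pdf p(s_k).
    """
    scores = []
    for A_k, b_k, pdfs, prior in zip(A, b, marginal_pdfs, priors):
        A_inv = np.linalg.inv(A_k)
        s = A_inv @ (x - b_k)                              # s_k = A_k^{-1}(x - b_k)
        p_s = np.prod([pdf(s_i) for pdf, s_i in zip(pdfs, s)])
        scores.append(abs(np.linalg.det(A_inv)) * p_s * prior)
    return int(np.argmax(scores))    # the denominator of Eq. (3.4) is common to all k

# Usage with illustrative unit-Laplacian marginals in two dimensions:
# laplace = lambda t: 0.5 * np.exp(-abs(t))
# k_hat = classify(x, [A_1, A_2], [b_1, b_2], [[laplace, laplace]] * 2, [0.5, 0.5])
```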