Table 4.1 ICAMM-based hierarchical clustering algorithm

0. Estimate the ICA parameters of each cluster at the bottom of the hierarchy, level $h = 1$: $\mathbf{A}_k,\ \mathbf{s}_k,\ \mathbf{b}_k,\ k = 1, \ldots, K$; use the Mixca algorithm in Chap. 3.
1. Estimate the pairwise distance for every pair of clusters from the observations $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N$; use Eq. (4.15) to estimate $\hat{H}(\mathbf{s}_u)$, $\hat{H}(\mathbf{s}_v)$, $\hat{H}(\mathbf{s}_v; \mathbf{s}_u)$, and $\hat{H}(\mathbf{s}_u; \mathbf{s}_v)$, with non-parametric kernel-based estimation of the source pdf.
2. Select the pair of clusters with minimum distance to merge.
3. Estimate the density of the merged cluster and set $h = h + 1$:
   $$p^{h}(\mathbf{x} \mid C_w) = \frac{p^{h-1}(\mathbf{x} \mid C_u^{h-1})\, p^{h-1}(C_u^{h-1}) + p^{h-1}(\mathbf{x} \mid C_v^{h-1})\, p^{h-1}(C_v^{h-1})}{p^{h-1}(C_u^{h-1}) + p^{h-1}(C_v^{h-1})},$$
   where $p^{h-1}(C_u^{h-1})$ and $p^{h-1}(C_v^{h-1})$ are the priors of clusters $u$ and $v$ at level $h-1$.
4. Calculate the distance from every remaining cluster $C_z$ to the merged cluster $C_w$; use $\arg\max_{C_k} p^{h}(C_k \mid \mathbf{x}) = p^{h}(\mathbf{x} \mid C_k)\, p^{h}(C_k) \Big/ \sum_{i=1}^{K-h+1} p^{h}(\mathbf{x} \mid C_i)\, p^{h}(C_i)$ to assign new data to a cluster in the hierarchy.
5. Repeat steps 2-4 until one cluster is obtained at the top of the hierarchy.
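To make the flow of Table 4.1 concrete, the following minimal sketch (in Python, with hypothetical names) mirrors steps 1-5. It assumes that the bottom-level ICA parameters have already been estimated with Mixca, that each cluster is represented by its likelihood function and prior, and that the entropy-based distance of Eq. (4.15) is available as a given function `pairwise_distance`; it is an illustration under these assumptions, not the reference implementation.

```python
# Minimal sketch of the agglomerative loop of Table 4.1 (hypothetical names).
# Each cluster is a (likelihood, prior) pair; `pairwise_distance` stands in for
# the entropy-based distance of Eq. (4.15) and is assumed to be given.

def merge_clusters(cluster_u, cluster_v):
    """Step 3: density and prior of the merged cluster C_w."""
    lik_u, prior_u = cluster_u
    lik_v, prior_v = cluster_v
    prior_w = prior_u + prior_v

    def lik_w(x):
        # p^h(x|C_w) = [p(x|C_u) p(C_u) + p(x|C_v) p(C_v)] / [p(C_u) + p(C_v)]
        return (lik_u(x) * prior_u + lik_v(x) * prior_v) / prior_w

    return lik_w, prior_w

def hierarchical_icamm(clusters, pairwise_distance):
    """Steps 1-5: repeatedly merge the closest pair until one cluster remains."""
    merges = []  # (u, v, distance) recorded at each level
    while len(clusters) > 1:
        # Steps 1-2: find the pair of clusters with minimum distance
        pairs = [(pairwise_distance(clusters[u], clusters[v]), u, v)
                 for u in range(len(clusters)) for v in range(u + 1, len(clusters))]
        d, u, v = min(pairs)
        # Step 3: replace C_u and C_v by the merged cluster C_w
        merged = merge_clusters(clusters[u], clusters[v])
        clusters = [c for k, c in enumerate(clusters) if k not in (u, v)] + [merged]
        merges.append((u, v, d))
    return merges

def assign(x, clusters):
    """Step 4: assign a datum to the cluster with maximum posterior at this level."""
    posteriors = [lik(x) * prior for lik, prior in clusters]
    return max(range(len(clusters)), key=lambda k: posteriors[k])
```

At every level, `assign` applies the maximum-posterior rule of step 4 over the clusters currently present in the hierarchy.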
We present simulation results that illustrate the properties of the hierarchical
algorithm proposed. We start with a simple example that illustrates how the algo-
rithm can be used to estimate hierarchical mixtures. Figure 4.2 shows ten different
ICA mixtures generated from sources with Laplacian (L) and uniform (U) dis-
tributions: 1 (U), 2 (U), 3 (L), 4 (L), 5 (U), 6 (U), 7 (L), 8 (L), 9 (U), and 10 (L), each plotted with a distinct marker. The hierarchical description of the data is shown at three
resolutions, $h = 1, 5, 9$. Notice how the mixture hierarchy naturally captures the
various levels of structure exhibited by the data. Figure 4.2 a shows an example of
how the algorithm first learned the basis vectors and bias terms for each class at
level h = 1 and then used these estimates to create a hierarchical tree of the
clusters. Figure 4.2 b shows the merged clusters at level h = 5, where six clusters
remain. The closest clusters are merged. In Fig. 4.2 c, there are only two clusters
remaining. Note that clusters are merged not only according to their relative bias,
but also according to the similarity of their densities. Figure 4.3 shows the resulting
dendrogram with the distances at which the clusters have been merged. The higher
the level of the hierarchy, the larger the distance required to merge the clusters,
and the weaker the independence assumption of the bottom level becomes.
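A dendrogram like the one in Fig. 4.3 can be reproduced from the recorded sequence of merges and their distances. One possible way, assuming the merge history is stored as (index of cluster u, index of cluster v, distance) triplets following SciPy's indexing convention, is the following sketch; the numeric values are purely illustrative and do not correspond to the figure.

```python
# Sketch: plotting a dendrogram such as Fig. 4.3 from a recorded merge history.
# merge_history is assumed to hold (index_u, index_v, distance) per merge, where
# original clusters are 0..K-1 and the i-th newly formed cluster gets index K+i.
import numpy as np
from scipy.cluster.hierarchy import dendrogram
import matplotlib.pyplot as plt

K = 10  # number of bottom-level ICA clusters, as in Fig. 4.2

# Example merge history (illustrative values only, not the book's results)
merge_history = [(0, 1, 0.3), (2, 3, 0.4), (10, 4, 0.7), (11, 5, 0.9),
                 (6, 7, 1.0), (12, 13, 1.3), (8, 14, 1.6), (9, 15, 1.9),
                 (16, 17, 2.5)]

# SciPy linkage format: [idx_u, idx_v, merge distance, number of leaves merged]
Z = np.zeros((K - 1, 4))
sizes = {i: 1 for i in range(K)}
for i, (u, v, d) in enumerate(merge_history):
    sizes[K + i] = sizes[u] + sizes[v]
    Z[i] = [u, v, d, sizes[K + i]]

dendrogram(Z)
plt.xlabel("cluster")
plt.ylabel("merge distance")
plt.show()
```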
To obtain good performance from the agglomerative clustering, the bandwidth
parameter ($h$ in Eq. (4.17)) has to be adjusted so that the distances are not all
thresholded to a single value. There is a trade-off between the resolution of the
distances and their range; thus, we chose a high bandwidth.
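As an illustration of the role of the bandwidth, the following sketch estimates a source pdf and its entropy with a Gaussian kernel. Eq. (4.17) may use a different kernel or normalization, so this is only a generic stand-in with hypothetical names.

```python
# Sketch: Gaussian-kernel estimate of a 1-D source pdf and of its entropy, showing
# how the bandwidth smooths the estimate (a generic KDE, assumed close in spirit to Eq. (4.17)).
import numpy as np

def kde_pdf(samples, points, bandwidth):
    """Kernel estimate of the pdf at `points` from 1-D source `samples`."""
    diff = (points[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diff**2) / np.sqrt(2 * np.pi)
    return kernel.mean(axis=1) / bandwidth

def entropy_estimate(samples, bandwidth):
    """Resubstitution estimate of the differential entropy H(s) = -E[log p(s)]."""
    p = kde_pdf(samples, samples, bandwidth)
    return -np.mean(np.log(p))

rng = np.random.default_rng(0)
s = rng.laplace(size=1000)   # a Laplacian source, as in the simulations
for h in (0.05, 0.3, 1.0):   # larger bandwidths give smoother pdf and entropy estimates
    print(h, entropy_estimate(s, h))
```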
In order to determine a quality criterion that allows the optimum hierarchical
level to be determined, the partition coefficient
$$PC = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{n_c} u_{ij}^{2}$$
and partition entropy
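The partition coefficient introduced above can be computed directly from the membership matrix. A minimal sketch follows, assuming $u_{ij}$ is the posterior membership of datum $i$ in cluster $j$ at the current level; the partition entropy is included with its standard definition for completeness.

```python
# Sketch: partition coefficient (PC) from the membership matrix u[i, j] = p^h(C_j | x_i).
# PE is the standard partition entropy, included as the companion criterion named in the text.
import numpy as np

def partition_coefficient(u):
    """PC = (1/N) * sum_i sum_j u_ij^2 ; close to 1 for a crisp partition."""
    return np.sum(u**2) / u.shape[0]

def partition_entropy(u, eps=1e-12):
    """PE = -(1/N) * sum_i sum_j u_ij * log(u_ij) ; close to 0 for a crisp partition."""
    return -np.sum(u * np.log(u + eps)) / u.shape[0]

# Example: memberships of N = 4 data in n_c = 2 clusters (illustrative values)
u = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
print(partition_coefficient(u), partition_entropy(u))
```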