upper to the bottom part of the tree). An agglomerative algorithm begins with
n subclusters, each of which contains a single data point, and then merges the two
most similar groups at each stage to form a new cluster, thus reducing the number of
clusters by one. The algorithm proceeds until all the data fall within a single cluster. A
divisive algorithm starts with a single cluster composed of all the given objects and
keeps splitting clusters according to some criterion, continuing until a partition into
n singleton clusters is obtained. There is a numerical value associated with each
position in the tree where branches join (the distance or dissimilarity between two
merged clusters).
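The agglomerative (bottom-up) procedure described above can be sketched in a few lines. The following is an illustrative toy implementation, not the chapter's method: it uses single-linkage merging on one-dimensional points, and the function name and data are hypothetical.

```python
# Minimal agglomerative (bottom-up) clustering sketch -- illustrative only,
# not the chapter's implementation. Single-linkage merging on 1-D points.

def agglomerative(points, target_clusters=1):
    """Start with n singleton clusters; repeatedly merge the two most
    similar (closest) clusters until `target_clusters` remain."""
    clusters = [[p] for p in points]          # n singleton clusters
    while len(clusters) > target_clusters:
        # Find the pair of clusters with the smallest inter-cluster
        # distance (single linkage: minimum pairwise point distance).
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(abs(x - y) for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge cluster b into cluster a, reducing the count by one.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Stopping at three clusters corresponds to sectioning the tree at that level:
print(agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], target_clusters=3))
# -> [[1.0, 1.2], [5.0, 5.1], [9.0]]
```

Running the merge loop to completion (`target_clusters=1`) reproduces the full dendrogram traversal; stopping early corresponds to cutting the tree at a desired level, as discussed next.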
A clustering of the data items into disjoint groups is obtained by sectioning the
partition tree at a desired level. For instance, the dashed line in Fig. 1.5 at the
fourth level of merging yields a clustering of three groups. This defines a data
partitioning into a number of clusters of comparable homogeneity. The procedures
used to find the partitioning that best fits the underlying data are called cluster
validation techniques [39, 40]. In general, there are three types of criteria for
evaluating the resulting clustering structure: external criteria (a pre-specified
data structure that reflects our intuition about the clustering structure); internal
criteria (quantities estimated from the clusters); and relative criteria (comparing
with other structures resulting from the same algorithm but with different
parameter values). Several indices have been proposed to implement the cluster
validity techniques; they are based on measuring the compactness and separation
of the clusters. Some examples of cluster validity indices are: the cophenetic
correlation coefficient, which compares the proximity matrix (pairwise data distances) of
the resulting clustering with a known independent partition; the ratio of cluster data
dispersion to a pairwise cluster dissimilarity measure; the root-mean-square standard
deviation of newly merged clusters; and the average scattering for clusters [40]. Specifically,
we have used the partition coefficient (PC) and partition entropy (PE) from
fuzzy clustering [41] to assess the partitioning obtained by the proposed hierarchical clustering:
$$\mathrm{PC} = \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{nc} u_{ij}^{2} \qquad (1.9)$$

$$\mathrm{PE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{nc} u_{ij}\,\log_a(u_{ij}) \qquad (1.10)$$
where $u_{ij}$ denotes the degree of membership (posterior probability) of the data point $x_i$ in
the $j$-th cluster. The PC index values range from $1/nc$ to 1, where $nc$ is the number of
clusters. The closer the index is to unity, the "crisper" the clustering is. A PC
value close to $1/nc$ indicates that there is no clustering tendency. The PE index
values range from 0 to $\log_a(nc)$. The closer the value of PE is to 0, the harder the
clustering is. As for the PC index, values of PE close to the upper bound
$(\log_a(nc))$ indicate the absence of any clustering structure in the dataset or the
inability of the algorithm to extract it.
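Equations (1.9) and (1.10) are straightforward to compute from a membership matrix. The sketch below uses Bezdek's standard definitions of PC and PE as given above; the function names and the two example matrices are made up for illustration.

```python
import math

# Partition coefficient (PC) and partition entropy (PE), Eqs. (1.9)-(1.10).
# `U` is an N x nc membership matrix: U[i][j] is the degree of membership
# of data point x_i in cluster j (each row sums to 1).

def partition_coefficient(U):
    N = len(U)
    return sum(u ** 2 for row in U for u in row) / N

def partition_entropy(U, a=math.e):
    N = len(U)
    # Terms with u = 0 contribute 0 (lim u*log u = 0), so skip them.
    return -sum(u * math.log(u, a) for row in U for u in row if u > 0) / N

# A crisp partition: PC = 1, PE = 0 (strong clustering structure).
crisp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
# A uniform partition: PC = 1/nc, PE = log_a(nc) (no clustering tendency).
uniform = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

print(partition_coefficient(crisp), partition_entropy(crisp))      # 1.0 0.0
print(partition_coefficient(uniform), partition_entropy(uniform))  # 0.5 ~0.693
```

The two extreme matrices confirm the ranges stated above: the crisp partition attains PC's upper bound of 1 and PE's lower bound of 0, while the uniform partition hits PC = 1/nc and PE = log_a(nc).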