Chapter 4
Hierarchical Clustering from ICA Mixtures
4.1 Introduction
In this chapter, we present a procedure for clustering (unsupervised learning) data
from a model based on mixtures of independent component analyzers. Clustering
techniques have been extensively studied in many different fields for a long time.
They can be organized in various ways according to several theoretical criteria.
However, a rough but widely accepted classification distinguishes hierarchical
and partitional clustering; see, for instance, [1]. Both clustering categories
provide a division of the data objects. The hierarchical approach also yields a
hierarchical structure from a sequence of partitions performed from singleton
clusters to a cluster including all data objects (agglomerative or bottom-up strat-
egy) or vice versa (divisive or top-down strategy). This structure consists of a
binary tree (dendrogram) whose leaves are the data objects and whose internal
nodes represent nested clusters of various sizes. The root node of the dendrogram
represents the whole data set, while the internal nodes indicate how proximal the
objects they contain are to one another; the height at which two branches merge
usually represents the distance between the corresponding pair of objects or
clusters, or between an object and a cluster.
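The agglomerative strategy just described can be sketched in a few lines of code. The following is a minimal, self-contained illustration using single linkage on a toy one-dimensional data set; the function name and data are illustrative assumptions only, not the ICA-mixture procedure this chapter develops:

```python
from itertools import combinations

def agglomerative_single_linkage(points):
    """Bottom-up clustering: start from singleton clusters and repeatedly
    merge the two closest clusters until one cluster holds all points.
    Returns the merge sequence, i.e. the dendrogram's internal nodes."""
    clusters = [frozenset([i]) for i in range(len(points))]

    def dist(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(abs(points[i] - points[j]) for i in a for j in b)

    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((sorted(a), sorted(b), dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

# Toy 1-D data: two well-separated groups {0, 1} and {9, 10}.
merges = agglomerative_single_linkage([0, 1, 9, 10])
for left, right, height in merges:
    print(left, right, height)
```

Reading the merge sequence bottom-up reproduces the dendrogram: the leaves are the four data points, each merge is an internal node, and the recorded distance is the height of that node. A divisive algorithm would traverse the same structure in the opposite direction, starting from the full data set.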
A review of clustering algorithms should include the following types:
hierarchical; squared error-based (vector quantization); mixture density-
based; graph theory-based; combinatorial search technique-based; fuzzy; neural
network-based; and kernel-based. In addition, some techniques have been developed
to tackle sequential, large-scale, and high-dimensional data sets [2]. The advantages
of hierarchical clustering include embedded flexibility regarding the level of
granularity and the ability to deal with different types of attributes. The disadvantages
of hierarchical clustering are the difficulty of scaling up to large data sets, the
vagueness of stopping criteria, and the fact that most clustering algorithms cannot
recover from poor choices when merging or splitting data points [3].