6.4.2 Overview of Clustering Algorithms
We present a brief overview of some clustering algorithms, especially those
used in the comparative experiments presented later. Further details on
algorithms, techniques and applications can be found in [247, 112, 113, 114,
25, 179, 225].
Among the huge variety of clustering algorithms, the hierarchical agglomerative
ones are probably the most widely used even today. Hierarchical agglomerative
algorithms create, by definition, a hierarchy of clusters from the dataset. They
start by assigning each instance to its own singleton cluster and then, based on
dissimilarity measures, merge small clusters into larger ones in a stepwise
manner. The process ends when all instances in the dataset are members of
a single cluster. The resulting hierarchical tree defines the clustering levels.
Examples of hierarchical clustering algorithms are CURE [85], ROCK [86],
AGNES [128], BIRCH [253, 254] and Chameleon [127].
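As an illustration of the stepwise-merging idea, the sketch below runs
average-linkage agglomerative clustering on a toy two-dimensional dataset using
SciPy; the data values and the choice of linkage are assumptions made purely
for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy 2-D dataset (hypothetical values, chosen only for illustration).
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])

# Each point starts in its own cluster; 'average' linkage repeatedly
# merges the two clusters with the smallest mean pairwise distance.
Z = linkage(X, method='average', metric='euclidean')

# Z encodes the full merge hierarchy; cutting it at a chosen level
# recovers a flat partition, here into two clusters.
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)  # e.g., [1 1 1 2 2 2]
```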
Other clustering algorithms, such as those based on graphs and graph theory,
were proposed in the early 1970s. They exploit the high connectivity of
similarity graphs to perform clustering [154, 155]. These algorithms are
usually divisive: they first consider a highly connected graph containing all
elements of the dataset (corresponding to a single cluster) and then perform
successive cuts in the graph to obtain smaller clusters. A cut in a graph is
the removal of a set of edges that disconnects the graph. A minimum cut
(min-cut) is a cut that removes the smallest number of edges (or, in weighted
graphs, the smallest total edge weight). Each cut splits one cluster into at
least two clusters. Examples of graph-based clustering algorithms can be found
in [120, 92, 93, 244]. Chameleon [127], mentioned earlier, also uses a
graph-theoretic approach.
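As a rough sketch of the divisive idea, the example below builds a small
weighted similarity graph and uses the Stoer-Wagner minimum cut (via the
networkx library) to split it into two clusters; the graph, weights, and node
names are all hypothetical.

```python
import networkx as nx

# Hypothetical similarity graph: nodes are data points and edge
# weights are similarities (higher means more alike).
G = nx.Graph()
G.add_weighted_edges_from([
    ('a', 'b', 0.9), ('b', 'c', 0.8), ('a', 'c', 0.85),  # dense group 1
    ('d', 'e', 0.9), ('e', 'f', 0.95), ('d', 'f', 0.8),  # dense group 2
    ('c', 'd', 0.1),                                     # weak bridge
])

# The minimum cut removes the cheapest set of edges that disconnects
# the graph, splitting the single cluster into two.
cut_value, (part1, part2) = nx.stoer_wagner(G)
print(cut_value)     # 0.1 -- only the weak bridge is cut
print(part1, part2)  # the two dense groups
```

A divisive algorithm would then recurse, cutting each resulting subgraph until
some stopping criterion is met.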
Graph cutting is also used in spectral clustering, commonly applied in
image segmentation and, more recently, in web page and document clustering
and also in bioinformatics. The rationale of spectral clustering is to use the
special properties of the eigenvectors of a Laplacian matrix as the basis to
perform clustering. Fiedler [73] was one of the first to show the application
of eigenvectors to graph partitioning. The Laplacian matrix is based on an
affinity matrix built with a similarity measure. The most common similarity
measure used in spectral clustering is $\exp(-d_{ij}/2\sigma^2)$, where
$d_{ij}$ is the Euclidean distance between data vectors $x_i$ and $x_j$ and
$\sigma$ is a scaling parameter.
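A minimal sketch of these ingredients follows: it builds the Gaussian affinity
matrix from pairwise Euclidean distances, forms the unnormalized Laplacian, and
computes its eigenvectors. The data values and $\sigma = 1$ are assumptions for
illustration (some variants use the squared distance in the exponent).

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

# Toy data (hypothetical values).
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
sigma = 1.0

# Affinity matrix A_ij = exp(-d_ij / 2*sigma^2), with d_ij the
# Euclidean distance between x_i and x_j.
D = cdist(X, X, metric='euclidean')
A = np.exp(-D / (2 * sigma**2))
np.fill_diagonal(A, 0.0)      # no self-affinity

# Unnormalized Laplacian L = Deg - A, with Deg the diagonal degree matrix.
Deg = np.diag(A.sum(axis=1))
L = Deg - A

# eigh returns eigenpairs in ascending order of eigenvalue; these
# eigenvectors are the basis for the clustering step.
eigvals, eigvecs = eigh(L)
```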
Spectral clustering algorithms differ mainly in the way they use the
eigenvectors to perform clustering. Some researchers use the eigenvectors of
the "normalized" Laplacian matrix [42] (or a similar one) to perform the
cutting, usually using the eigenvector associated with the second smallest
eigenvalue [209, 124, 55]. Others use the largest eigenvectors as input to
another clustering algorithm [166, 157]. A comparison of several spectral
clustering algorithms can be found in [229].
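Continuing the sketch above, the cut-based family can be illustrated by
splitting the data on the sign of the eigenvector for the second smallest
eigenvalue (the Fiedler vector); again the data and $\sigma$ are made up for
the example.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
D = cdist(X, X, metric='euclidean')
A = np.exp(-D / 2.0)           # Gaussian affinity, sigma = 1 (assumed)
np.fill_diagonal(A, 0.0)
L = np.diag(A.sum(axis=1)) - A

# The eigenvector for the second smallest eigenvalue of L (the Fiedler
# vector) induces a bipartition: split points by the sign of its entries.
_, eigvecs = eigh(L)
labels = (eigvecs[:, 1] > 0).astype(int)
print(labels)                  # e.g., [0 0 1 1] (or the labels flipped)
```

The other family instead stacks several leading eigenvectors as columns and
feeds the resulting rows to a standard algorithm such as k-means.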
Other clustering algorithms exploit regions of different density in the data to
perform clustering. Examples of these density-based clustering algorithms are