Statistical Clustering Analysis: An Introduction - Clustering Challenges in Biological Network

Biology Reference

In-Depth Information

combined since the dissimilarity of these two clusters are the minimum among all

mutual dissimilarities of remaining clusters. This procedure stops when there is

only one cluster left, i.e., all objects are clustered into one cluster.

There are several variants of the agglomerative hierarchical clustering algo-

rithm. The selection of dissimilarity measure between two clusters is one source

of the variants. Another source of the variants is the stopping criteria. Well-known

variants of hierarchical clustering methods include CURE [11], ROCK [12], and

CHEMELEON [18].

The dissimilarity measure of two clusters used in the agglomerative hierarchi-

cal clustering algorithm includes single-linkage dissimilarity, complete-linkage

dissimilarity, minimum-variance dissimilarity [30], and some others [15]. The

first two are most popularly used measures. Figure 5.4 illustrates these two mea-

sures with a 2-D example. Single-linkage dissimilarity of two clusters is the min-

imum of the distances between all pairs of objects, one from a cluster, and the

other one from another cluster. In Fig. 5.4, we use D KL to represent the dissimilar-

ity between clusters K and L , denoted as C K and C L respectively. Single-linkage

dissimilarity is calculated as follows:

D KL =min

x i ∈ C K , x j ∈ C L

d 2 ( x i , x j );

∀

i , j .

(5.17)

Contrarily, complete-linkage dissimilarity between clusters C K and C L takes

the maximum one as the dissimilarity measure, i.e.,

D KL =m x

x i ∈ C K , x j ∈ C L

d 2 ( x i , x j );

∀

i , j .

(5.18)

Fig. 5.4.

Dissimilarity measure of two clusters: (a)single-linkage; (b)complete linkage.

Now, we compare the performance of these two dissimilarity measures in hier-

archical clustering algorithms to help users to determine which measure to choose

when clustering different datasets. Jain and Goel [15] compare the performance

of these two dissimilarity measures and several other measures in agglomerative

Search WWH ::

Custom Search

Home