Graphics Reference
In-Depth Information
approach the task of dividing a set of N objects into K groups. Trivially, for K
theonlypossiblesolutionisonebigclusterconsistingofthecompletedatasetX N .
Similarly, for K
=
N we have N clusters containing only one point each, i.e., each
point is its own cluster.
Divisive hierarchical clustering methods start with the complete data set X N and
(usually) splititintotwogroups.Eachofthesegroupsisthenrecursively dividedinto
two subgroups, and so on until each point forms a cluster of its own. Agglomerative
hierarchical clustering works the other way round, and thus starts with N single-
ton clusters. A hierarchy of clusters is created by repeatedly joining the two “closest”
clusters until the complete data set forms one cluster.
In both cases we need to be able to measure the distances d
=
between d-
dimensional data points x and y , and between groups of points. he two most com-
mon distance measures are the Euclidean distance
(
x , y
)
d
i =
d
(
x, y
)=
(
x i
y i
)
,
( . )
and the Manhattan distance
d
i =
d
(
x, y
)=
x i
y i
.
( . )
Distances between groupsof points are usedtojoin (or link)clusters,andsotheseare
oten referred toas linkage methods. Assumethat we have two sets A and B of points.
hethreesimplestlinkagemethodsformeasuringthedistance l
(
A, B
)
betweenthese
two sets are:
Single linkage: the distance between the two closest points of the clusters
l
(
A, B
)=
min
a A,b B
d
(
a, b
)
( . )
Complete linkage: the distance between the two points of the clusters that are far-
thest apart
l
(
A, B
)=
max
a A,b B
d
(
a, b
)
( . )
Average linkage: the mean distance between the points in the two clusters
l
(
A, B
)=
d
(
a, b
)
( . )
a A
b B
A
B
Numerousother linkage methods have been invented, and many of them can berep-
resented via the unifying framework given by Lance and Williams ( ).
Dendrograms
11.2.1
he results from hierarchical clustering are typically presented as a dendrogram, i.e.,
a tree where the root represents the one-cluster solution (complete data set) and the
Search WWH ::




Custom Search