Database Reference
In-Depth Information
+
+
+
(a)
(b)
(c)
Figure 6.5 Example of clustering with different basic methods. (a) k -means. (b) Hierarchi-
cal. (c) Density-based.
that lies in the perfect center of the cluster 5 , then reassigns each object to the
centroid that is closest to it. Such iterative process stops when convergence
(perfect or approximate) is reached.
Hierarchical clustering methods (Figure 6.5 b) try to organize objects in a
multilevel structure of clusters and subclusters. The idea is that under tight
proximity requirements, several small and specific clusters might be obtained,
while loosening the requirements some clusters might be merged together into
larger and more general ones. For instance, agglomerative methods start from
a set of extremely small clusters - one singleton for each input object - and
iteratively select and merge together the pairs of clusters that are most similar.
At each iteration, then, the number of clusters decreases by one unit, and the
process ends when only one huge cluster is obtained, containing all objects.
The final output will be a data structure called dendogram , represented as a
tree where each singleton cluster is a leaf, and each cluster is a node having as
children the two subclusters that originated it through merging.
Finally, density-based clustering (Figure 6.5 c), as already introduced in Sec-
tion 6.2.1 , is aimed to form maximal, crowded (i.e., dense) groups of objects,
5 Notice that such object is a new one, computed from those in the cluster. Therefore, some level of
understanding of the data structure is needed here. When that is not possible, usually a variant is
applied, called k-medoid , that selects the most central object of the cluster as representative.
Search WWH ::




Custom Search