Database Reference
In-Depth Information
7.3.6
Exercises for Section 7.3
EXERCISE 7.3.1 For the points of Fig. 7.8 , if we select three starting points using the meth-
od of Section 7.3.2 , and the first point we choose is (3,4), which other points are selected.
!! EXERCISE 7.3.2 Prove that no matter what point we start with in Fig. 7.8 , if we select
three starting points by the method of Section 7.3.2 we obtain points in each of the three
clusters. Hint : You could solve this exhaustively by begining with each of the twelve points
in turn. However, a more generally applicable solution is to consider the diameters of the
three clusters and also consider the minimum intercluster distance , that is, the minimum
distance between two points chosen from two different clusters. Can you prove a general
theorem based on these two parameters of a set of points?
! EXERCISE 7.3.3 Give an example of a dataset and a selection of k initial centroids such
that when the points are reassigned to their nearest centroid at the end, at least one of the
initial k points is reassigned to a different cluster.
EXERCISE 7.3.4 For the three clusters of Fig. 7.8 :
(a) Compute the representation of the cluster as in the BFR Algorithm. That is, compute
N , SUM, and SUMSQ.
(b) Compute the variance and standard deviation of each cluster in each of the two dimen-
sions.
EXERCISE 7.3.5 Suppose a cluster of three-dimensional points has standard deviations of 2,
3, and 5, in the three dimensions, in that order. Compute the Mahalanobis distance between
the origin (0 , 0 , 0) and the point (1 , −3 , 4).
7.4 The CURE Algorithm
We now turn to another large-scale-clustering algorithm in the point-assignment class. This
algorithm, called CURE (Clustering Using REpresentatives), assumes a Euclidean space.
However, it does not assume anything about the shape of clusters; they need not be nor-
mally distributed, and can even have strange bends, S-shapes, or even rings. Instead of rep-
resenting clusters by their centroid, it uses a collection of representative points, as the name
implies.
Search WWH ::




Custom Search