Databases Reference
In-Depth Information
together, and pieces of the inner circle would stick together, but pieces of ring
would always be far away from the pieces of the circle. Note that if we used the
rule that the distance between clusters was the distance between their centroids,
then we might not get the intuitively correct result. The reason is that the
centroids of both clusters are in the center of the diagram.
Figure 7.13: Select representative points from each cluster, as far from one
another as possible
For the second step, we pick the representative points. If the sample from
which the clusters are constructed is large enough, we can count on a cluster's
sample points at greatest distance from one another lying on the boundary of
the cluster. Figure 7.13 suggests what our initial selection of sample points
might look like.
Finally, we move the representative points a fixed fraction of the distance
from their true location toward the centroid of the cluster. Note that in Fig. 7.13
both clusters have their centroid in the same place: the center of the inner circle.
Thus, the representative points from the circle move inside the cluster, as was
intended. Points on the outer edge of the ring also move into their cluster, but
points on the ring's inner edge move outside the cluster. The final locations of
the representative points from Fig. 7.13 are suggested by Fig. 7.14.
2
7.4.2
Completion of the CURE Algorithm
The next phase of CURE is to merge two clusters if they have a pair of rep-
resentative points, one from each cluster, that are su ciently close. The user
may pick the distance that defines “close.” This merging step can repeat, until
there are no more su ciently close clusters.
Example 7.12 : The situation of Fig. 7.14 serves as a useful illustration. There
is some argument that the ring and circle should really be merged, because their
centroids are the same. For instance, if the gap between the ring and circle were
Search WWH ::




Custom Search