Figure 3.7 IBM SPSS Modeler recommended K-means Expert options.
Unlike classical hierarchical clustering techniques, it can handle a large
number of records, because the input records are first condensed into subclusters.
As the name implies, the clustering process comprises two steps. In the first
step of pre-clustering, the entire dataset is scanned and a large number of small,
primary clusters are found. Records are assigned to the primary clusters based
on their distance from them. The pre-clusters are characterized by their summary
statistics, namely the mean and variance of each numeric clustering field. Each
record is recursively guided to the closest pre-cluster; if its distance is below
an accepted threshold, it is assigned to the pre-cluster, otherwise it starts a
pre-cluster of its own. In the second step, a hierarchical (agglomerative)
clustering algorithm is applied to the pre-clusters, which are recursively merged
until the final solution is reached.
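
The two steps can be illustrated with a minimal Python sketch. It is not the IBM SPSS Modeler implementation. It assumes plain Euclidean distance to the pre-cluster mean, a flat list of pre-clusters rather than a tree, and a final cluster count supplied by the caller. The names PreCluster, pre_cluster, merge_pre_clusters, threshold and n_final are hypothetical and chosen only for this example.

```python
import copy
import math


class PreCluster:
    """Summary statistics of one pre-cluster: count, per-field sums and sums of squares."""

    def __init__(self, record):
        self.n = 1
        self.sums = list(record)
        self.sq_sums = [x * x for x in record]

    def mean(self):
        return [s / self.n for s in self.sums]

    def variance(self):
        # Population variance per field, derived from the running sums.
        return [max(sq / self.n - (s / self.n) ** 2, 0.0)
                for s, sq in zip(self.sums, self.sq_sums)]

    def add(self, record):
        # Update the summary statistics with one more record.
        self.n += 1
        for i, x in enumerate(record):
            self.sums[i] += x
            self.sq_sums[i] += x * x

    def absorb(self, other):
        # Merge another pre-cluster into this one by adding its statistics.
        self.n += other.n
        for i in range(len(self.sums)):
            self.sums[i] += other.sums[i]
            self.sq_sums[i] += other.sq_sums[i]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pre_cluster(records, threshold):
    """Step 1: one scan of the data, producing many small pre-clusters."""
    clusters = []
    for record in records:
        closest = min(clusters,
                      key=lambda c: euclidean(c.mean(), record),
                      default=None)
        if closest is not None and euclidean(closest.mean(), record) < threshold:
            closest.add(record)                  # close enough: join the pre-cluster
        else:
            clusters.append(PreCluster(record))  # otherwise start a new one
    return clusters


def merge_pre_clusters(clusters, n_final):
    """Step 2: agglomerative merging of pre-clusters until n_final remain."""
    clusters = copy.deepcopy(clusters)  # leave the caller's pre-clusters untouched
    while len(clusters) > n_final:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclidean(clusters[i].mean(), clusters[j].mean())
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].absorb(clusters[j])  # merge the closest pair
        del clusters[j]
    return clusters


# Illustrative data only (made up for this sketch).
records = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.1, 4.9), (20.0, 20.0), (0.9, 1.1)]
pre = pre_cluster(records, threshold=1.0)
final = merge_pre_clusters(pre, n_final=2)
```

Because the second step works on the pre-cluster summaries rather than on individual records, its cost depends on the number of pre-clusters, not on the size of the dataset, which is the point made in the text above.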