scheme throughout the enterprise. It involves recognition of the differentiating
characteristics of each cluster through profiling with the use of descriptive statistics
and charts.
Cluster profiling typically involves:
1. Comparison of clusters with respect to the clustering fields - examining
the cluster centers: The profiling phase usually starts with the cross-tabulation
of the cluster membership field with the clustering inputs, which are also
referred to as the clustering fields. The goal is to identify the input data
patterns that distinguish each cluster. Because the derived clusters are
interpreted and labeled according to their differentiating characteristics, the
profiling phase inevitably involves going back to the inputs and determining
how each cluster differs from the others with respect to the clustering fields.
Typically, data miners start the profiling process with an examination of
the cluster centers or seeds, also referred to as the cluster centroids. A cluster
centroid is obtained by averaging each input field over all the records of a
cluster. The centroid can be thought of as the prototype or the most
typical representative member of a cluster. Cluster centers for the simple case
of two clustering fields and two derived clusters are depicted in Figure 3.13.
Figure 3.13 Graphical representation of the cluster centers or centroids.
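The centroid calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name cluster_centroids, the sample records and the cluster labels are all hypothetical and do not come from the text. It mirrors the simple case of two clustering fields and two derived clusters, and averages each input field over the records of each cluster.

```python
from collections import defaultdict


def cluster_centroids(records, labels):
    """Return {cluster label: [mean of each input field]}.

    records: sequence of equal-length numeric tuples (the clustering fields)
    labels:  sequence of cluster membership values, one per record
    """
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(records, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        # Accumulate the field totals for this record's cluster
        for i, value in enumerate(row):
            sums[label][i] += value
        counts[label] += 1
    # Divide each field total by the number of records in the cluster
    return {
        label: [total / counts[label] for total in field_sums]
        for label, field_sums in sums.items()
    }


# Hypothetical data: two clustering fields, two derived clusters
records = [(1.0, 2.0), (3.0, 4.0), (10.0, 20.0), (14.0, 24.0)]
labels = [1, 1, 2, 2]

centroids = cluster_centroids(records, labels)
for label in sorted(centroids):
    print(label, centroids[label])
```

Comparing the resulting rows field by field gives a first indication of which inputs differentiate the clusters; in practice the same table is usually produced by a group-by mean in the data mining tool at hand.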