Figure 4.9 Six clusters applied to the points from Figure 4.8
4.2.5 Reasons to Choose and Cautions
K-means is a simple and straightforward method for defining clusters. Once
clusters and their associated centroids are identified, it is easy to assign a
new object (for example, a new customer) to a cluster: the object joins the
cluster whose centroid is closest to it. Because the method is unsupervised,
using k-means helps to reduce subjectivity in the analysis.
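As a minimal sketch of this assignment step, the following Python code places a new object in the cluster with the nearest centroid. It assumes Euclidean distance; the centroid values, the attribute choice (age and income), and the function name assign_to_cluster are hypothetical and serve only as illustration.

```python
import math

def assign_to_cluster(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical centroids from an earlier k-means run,
# given as (age in years, income in thousands of dollars).
centroids = [(25.0, 30.0), (45.0, 80.0), (60.0, 50.0)]

# A new customer that was not part of the original clustering.
new_customer = (42.0, 75.0)
cluster = assign_to_cluster(new_customer, centroids)
print(cluster)
```

Only the centroids are needed to place a new object; the original data set does not have to be clustered again.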
Although k-means is considered an unsupervised method, there are still several
decisions that the practitioner must make:
• What object attributes should be included in the analysis?
• What unit of measure (for example, miles or kilometers) should be used
for each attribute?
• Do the attributes need to be rescaled so that one attribute does not have a
disproportionate effect on the results?
• What other considerations might apply?
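To illustrate why the unit of measure and rescaling matter, the sketch below shows the nearest centroid of one object changing when a single attribute is converted from kilometers to meters, and then shows that standardizing each attribute (z-scores) gives the same values in either unit. All data values and the helper names nearest, km_to_m, and standardize are hypothetical; z-scoring is one common rescaling choice, not the only one.

```python
import math
import statistics

def nearest(point, centroids):
    """Index of the centroid with the smallest Euclidean distance to `point`."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def km_to_m(row):
    """Convert the second attribute from kilometers to meters."""
    return (row[0], row[1] * 1000.0)

# Hypothetical object and centroids: (age in years, commute distance in km).
point_km = (30.0, 10.0)
centroids_km = [(32.0, 20.0), (50.0, 11.0)]

nearest_km = nearest(point_km, centroids_km)
nearest_m = nearest(km_to_m(point_km), [km_to_m(c) for c in centroids_km])
print(nearest_km, nearest_m)  # the two indices differ

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stdevs)) for row in rows]

# Hypothetical data set in both units; the z-scores agree up to rounding.
rows_km = [(30.0, 10.0), (32.0, 20.0), (50.0, 11.0), (41.0, 15.0)]
rows_m = [km_to_m(r) for r in rows_km]
z_km = standardize(rows_km)
z_m = standardize(rows_m)
```

In kilometers the age difference dominates the distance, while in meters the commute distance dominates, so the same object lands in a different cluster. After standardization the choice of unit no longer affects the result.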
Object Attributes
When deciding which object attributes (for example, age and income) to use in
the analysis, it is important to understand which attributes will be known at
the time a new object is assigned to a cluster. For example, information on
existing customers' satisfaction or purchase frequency may be available, but
such information may not be available for potential customers.