of simple functions on the original fields. Their purpose is to better summarize
customer behavior and convey the differentiating characteristics of each
customer. This is a critical step that depends greatly on the expertise, experience,
and "imagination" of the project team, since the development of an informative
list of inputs can lead to richer and more refined segmentations.
The modeling data may also require transformations, specifically
standardization, so that the values and the variations of the different fields are
comparable. Clustering techniques typically group records by distance, so they
are sensitive to differences in the measurement scale of the fields. If we do not
deal with these differences, the segmentation solution will be dominated by the
fields measured on the largest scales.
Fortunately, many clustering algorithms offer integrated standardization
methods to adjust for differences in measurement scales. Similarly, the
application of a data reduction technique like principal components analysis
(PCA) or factor analysis also provides a solution, since the generated
components or factors have standardized values.
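
As a rough illustration of the scale problem described above, the following minimal sketch applies z-score standardization with NumPy. The customer records, the field meanings (monthly spend and number of visits), and the helper names standardize and spend_share are assumptions made for this example only; they do not come from the text.

```python
import numpy as np

def standardize(X):
    """Z-score each field: subtract its mean, divide by its standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant fields become all zeros instead of dividing by zero
    return (X - mean) / std

def spend_share(a, b):
    """Fraction of the squared Euclidean distance between two records due to field 0."""
    diff2 = (np.asarray(a) - np.asarray(b)) ** 2
    return diff2[0] / diff2.sum()

# Hypothetical customers: [monthly spend in currency units, number of visits]
customers = np.array([
    [1200.0, 2.0],
    [1250.0, 9.0],
    [300.0, 2.0],
    [350.0, 8.0],
])

Z = standardize(customers)

# Customers 0 and 1 spend about the same but visit very differently.
raw_share = spend_share(customers[0], customers[1])
std_share = spend_share(Z[0], Z[1])
print(f"share of distance due to spend, raw fields:          {raw_share:.3f}")
print(f"share of distance due to spend, standardized fields: {std_share:.3f}")
```

On the raw fields, the distance between the first two customers comes almost entirely from the spend field, even though their visit counts are what actually sets them apart. After standardization, the visit field drives the distance.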
6. Data reduction using PCA or factor analysis: The data preparation stage
is typically concluded by the application of an unsupervised data reduction
technique such as PCA or factor analysis. These techniques reduce the data
dimensionality by effectively replacing a typically large number of original
inputs with a relatively small number of compound scores, called factors or
principal components. They identify the underlying data dimensions by which
the customers will be segmented. The derived scores are then used as inputs
in the clustering model that follows. The advantages of using a data reduction
technique as a data preprocessing step include:
(a) Simplicity and conceptual clarity. The derived scores are relatively few
and can be interpreted and labeled. They can be used for cluster profiling
to provide the first insight into the segments.
(b) Standardization of the clustering inputs, a feature that is important in
yielding an unbiased solution.
(c) Equal contributions from the data dimensions to the formation of the
segments.
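
The data reduction step described above can be sketched as follows. This is a minimal illustration that computes principal components with NumPy's singular value decomposition, not the procedure of any particular tool. The synthetic data, the number of retained components, and the function name pca_scores are assumptions made for the example.

```python
import numpy as np

def pca_scores(X, n_components):
    """Return unit-variance component scores, component weights, and explained variance ratios.

    Assumes no field is constant and that the retained components have non-zero variance.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)       # standardize the original fields
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    weights = Vt[:n_components]                    # one row of field weights per component
    scores = Z @ weights.T                         # project customers onto the components
    scores = scores / scores.std(axis=0)           # rescale each component to unit variance
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, weights, explained[:n_components]

# Synthetic data: six hypothetical fields driven by two underlying dimensions plus noise.
rng = np.random.default_rng(0)
n = 200
dim_a = rng.normal(size=n)
dim_b = rng.normal(size=n)
X = np.column_stack([
    60 * dim_a + rng.normal(scale=5, size=n),
    3 * dim_a + rng.normal(scale=0.5, size=n),
    900 * dim_a + rng.normal(scale=100, size=n),
    12 * dim_b + rng.normal(scale=2, size=n),
    0.4 * dim_b + rng.normal(scale=0.05, size=n),
    150 * dim_b + rng.normal(scale=20, size=n),
])

scores, weights, explained = pca_scores(X, n_components=2)
print("scores shape:", scores.shape)
print("explained variance ratios:", np.round(explained, 3))
```

Here six inputs on very different scales are replaced by two uncorrelated scores with unit variance, so each retained dimension contributes equally when the scores are passed to a clustering model.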
Factor Analysis Technical Tips
PCA is the recommended technique when the primary goal is data reduction.
To simplify the interpretation of the derived components, the
application of a rotation, typically Varimax, is recommended.
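
To give a feel for what a Varimax rotation does, the sketch below uses a commonly seen SVD-based iteration for the varimax criterion, written with NumPy. It is an illustration under stated assumptions rather than the routine of any specific statistical package: the loading matrix is made up, the function name varimax and its parameters max_iter and tol are hypothetical, and no Kaiser normalization is applied.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a (fields x components) loading matrix toward the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    L = np.asarray(loadings, dtype=float)
    n_rows, n_cols = L.shape
    R = np.eye(n_cols)
    previous = 0.0
    for _ in range(max_iter):
        rotated = L @ R
        # Gradient-like target derived from the varimax criterion
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / n_rows
        U, s, Vt = np.linalg.svd(L.T @ target)
        R = U @ Vt                      # closest orthogonal matrix to L.T @ target
        current = s.sum()
        if previous != 0 and current < previous * (1 + tol):
            break                       # stop when the objective no longer improves
        previous = current
    return L @ R, R

# Hypothetical unrotated loadings: every field loads on both components.
loadings = np.array([
    [0.70, 0.45],
    [0.65, 0.50],
    [0.75, 0.30],
    [0.55, -0.60],
    [0.50, -0.55],
    [0.40, -0.70],
])

rotated, R = varimax(loadings)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each field's communality (the sum of its squared loadings) is unchanged. Only the way the loadings are distributed across components changes, ideally so that each field loads strongly on one component and weakly on the others, which makes the components easier to label.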