criterion. Only components with eigenvalues above 1 were retained and this
yielded a set of 13 components, which explained 83.7% of the original information,
a percentage more than satisfactory for considering the solution as representative
of the initial fields.
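The eigenvalue-above-1 rule described here can be sketched in a few lines. The example below is illustrative only: the data are synthetic (six hypothetical fields driven by two latent behaviors), and the variable names are assumptions, not part of the original project. It extracts the eigenvalues of the correlation matrix, retains the components whose eigenvalue exceeds 1, and reports the share of total variance they explain.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 500 customers, 6 fields driven by 2 latent behaviors.
latent = rng.normal(size=(500, 2))
weights = np.array([[1, 1, 1, 1, 0, 0],
                    [0, 0, 0, 0, 1, 1]], dtype=float)
X = latent @ weights + 0.5 * rng.normal(size=(500, 6))

# PCA on the correlation matrix, so every field has unit variance.
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)

# eigh returns ascending eigenvalues; sort them in descending order.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Retain only components with an eigenvalue above 1.
keep = eigvals > 1.0
n_components = int(keep.sum())

# Share of the total variance carried by the retained components.
explained = eigvals[keep].sum() / eigvals.sum()

print(f"retained components: {n_components}")
print(f"explained variance: {explained:.1%}")
```

Because the eigenvalues of a correlation matrix sum to the number of fields, a component with an eigenvalue above 1 carries more information than any single original field, which is the rationale for the cutoff.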
However, the proportion of explained variance was not the only factor
considered before accepting the specific solution. The extracted components
were to be used as inputs to subsequent clustering models, so they had to be
understandable and clearly associated with specific behavioral aspects. After all,
customers were about to be separated according to these newly derived composite
measures, so it was vital that they had a crystal clear business meaning.
The component interpretation phase was based on the loadings of the 13
components. They denote the correlations of the components with the original
inputs and are presented in the rotated component matrix of Table 6.16.
These loadings were examined to recognize the information conveyed by each
component and facilitate its labeling. The labeled components along with a brief
explanation of their meaning are presented in Table 6.17.
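To make the notion of a loading concrete, the following sketch computes loadings for synthetic data and checks that they equal the correlations between the original inputs and the component scores. It is a minimal illustration under stated assumptions: the fields and data are hypothetical, and the loadings are unrotated, whereas the table referred to above reports a rotated solution.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical data: fields 0-3 share one latent behavior, fields 4-5 another.
latent = rng.normal(size=(500, 2))
weights = np.array([[1, 1, 1, 1, 0, 0],
                    [0, 0, 0, 0, 1, 1]], dtype=float)
X = latent @ weights + 0.5 * rng.normal(size=(500, 6))

# Standardize the inputs and decompose their correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
keep = eigvals > 1.0

# Unrotated loadings: eigenvector entries scaled by sqrt(eigenvalue).
loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])

# Component scores, and their correlations with each original input.
scores = Z @ eigvecs[:, keep]
p = X.shape[1]
corr = np.corrcoef(np.hstack([Z, scores]), rowvar=False)[:p, p:]

# For each field, the component on which it loads most strongly.
dominant = np.abs(loadings).argmax(axis=1)

print(np.round(loadings, 2))
print("dominant component per field:", dominant.tolist())
```

Reading down each column of the loadings matrix shows which fields move together with a component, which is the information used to give the component a business label.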
Segmenting the "Pure Mass" Customers with Cluster Analysis
The original fields were temporarily left out of the segmentation procedure
and replaced by the interpreted and labeled components, which were used as
inputs for the training of a clustering model. The clustering procedure included
many trials and the application of different modeling techniques and parameter
settings before settling on the final segmentation solution to be adopted for
deployment. The accepted solution was derived by a TwoStep cluster model,
which, as mentioned in previous chapters, offers some useful features, such as the
automatic clustering procedure that proposes the "optimal" number of clusters
and the integrated handling of outliers, an option that can prevent distortion of the
results due to noisy records.
The data miners involved in the project did not specify in advance the number
of clusters to be created. Instead, they let the algorithm propose the optimal
number, between 2 and 15 clusters. The algorithm suggested a solution of four
clusters, and the next task was to fully understand the structure of each cluster
before judging the usefulness of the resulting segmentation scheme.
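The idea of letting the procedure propose the number of clusters within a range of 2 to 15 can be sketched as follows. This is not the TwoStep algorithm: as a stand-in, the example uses a plain k-means with a deterministic farthest-point start and picks the number of clusters with the highest mean silhouette. The data (four well-separated synthetic groups) and all function names are assumptions made for illustration.

```python
import numpy as np

def farthest_point_init(X, k):
    # Deterministic seeding: start from the first row, then repeatedly add
    # the row farthest from all centers chosen so far.
    centers = [X[0]]
    for _ in range(1, k):
        diff = X[:, None, :] - np.array(centers)[None, :, :]
        nearest = np.linalg.norm(diff, axis=2).min(axis=1)
        centers.append(X[np.argmax(nearest)])
    return np.array(centers)

def kmeans(X, k, n_iter=100):
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def mean_silhouette(X, labels):
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    ids = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton cluster: silhouette taken as 0
        a = D[i, own].sum() / (n_own - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in ids if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()

# Hypothetical component scores: four well-separated customer groups.
rng = np.random.default_rng(0)
blob_centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in blob_centers])

# Evaluate every candidate number of clusters from 2 to 15.
scores = {k: mean_silhouette(X, kmeans(X, k)) for k in range(2, 16)}
best_k = max(scores, key=scores.get)
print("proposed number of clusters:", best_k)
```

Whatever criterion is used, the proposed number is only a starting point; as the text notes, the solution still has to be profiled and judged on its business usefulness.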
Profiling of Segments
Standard reporting techniques were applied to reveal the data patterns of the
clusters and to identify the differentiating characteristics that define them. This
profiling process provided insight into the clusters and revealed the customer
types and behaviors behind each grouping. Consequently, it also facilitated an