Segmentation Applications in Banking - Data Mining Techniques in CRM: Inside Customer Segmentation

and information on channel utilization, were not included in the model training

procedure since they would just confound the separation and lead the analytical

process away from the specific business goal. However, in the end all available

information of interest was taken into account during the cluster profiling and

evaluation phase.

Categorical fields were also omitted from the clustering procedure

since, as mentioned in previous chapters, they tend to produce biased clustering

solutions which overlook differences attributable to other inputs.

THE ANALYTICAL PROCESS

The segmentation process comprised two steps. First, PCA was

applied to reveal the distinct data dimensions underlying the 41 inputs listed

above. Then a clustering model was used to reveal the final segmentation solution.

PCA, although optional, is a useful data preparation step aimed at data

reduction.

The extracted principal components, once explained and fully understood,

were used as clustering inputs instead of the original fields. This was the second

and final step of the analytical process: a clustering model assessed the similarities

of the records/customers in terms of the revealed components and suggested

the underlying customer groupings. The proposed clusters were then interpreted

and evaluated, mainly in terms of their business meaning and usefulness, before

concluding on the final solution adopted for the organization.
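
The two-step process described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the data is synthetic (6 numeric inputs rather than the 41 fields), the helper names `pca_scores` and `kmeans` are hypothetical, and plain k-means with a deterministic farthest-point start stands in for whichever clustering model was actually used.

```python
import numpy as np

def pca_scores(X, n_components):
    """Standardize X and project it onto its leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return Z @ eigvecs[:, order]

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm with a deterministic farthest-point start."""
    centroids = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen centroid.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

# Synthetic stand-in for customer data: two well-separated groups of 150 records.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=1.0, size=(150, 6))
group_b = rng.normal(loc=4.0, scale=1.0, size=(150, 6))
X = np.vstack([group_a, group_b])

scores = pca_scores(X, n_components=2)   # step 1: data reduction
labels, centroids = kmeans(scores, k=2)  # step 2: clustering on the components
```

In practice the resulting `labels` would then be profiled against all available fields, including those left out of model training, before a solution is accepted.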

Identifying the Segmentation Dimensions with PCA/Factor Analysis

The team involved in the project selected PCA as the data reduction method. The

components extracted by PCA are uncorrelated linear combinations of the original

inputs. They are extracted in order of importance, with the first one carrying the

largest part of the variance of the original fields. The subsequent components

explain smaller portions of the total variance and are uncorrelated with each other.

The analysts also chose to apply a Varimax rotation

in order to simplify interpretation of the components.
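
To make the rotation step more concrete, here is a minimal sketch of a Varimax rotation applied to PCA loadings. It assumes the commonly used SVD-based iteration, without the row normalization that some statistical packages apply first; the function names `pca_loadings` and `varimax` and the synthetic data are illustrative assumptions, not taken from the project.

```python
import numpy as np

def pca_loadings(X, n_components):
    """Loadings of the leading components of the correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Scale each eigenvector by the square root of its eigenvalue.
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation; returns rotated loadings and the rotation matrix."""
    n_vars, n_comp = loadings.shape
    rotation = np.eye(n_comp)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        grad = loadings.T @ (rotated ** 3 - rotated @ column_ss / n_vars)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt                      # nearest orthogonal matrix
        if s.sum() < objective * (1 + tol):    # no meaningful improvement
            break
        objective = s.sum()
    return loadings @ rotation, rotation

# Synthetic inputs: two blocks of three variables, each block driven by one factor.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 500))
X = np.column_stack(
    [f1 + 0.3 * rng.normal(size=500) for _ in range(3)]
    + [f2 + 0.3 * rng.normal(size=500) for _ in range(3)]
)

loadings = pca_loadings(X, n_components=2)
rotated, rotation = varimax(loadings)
```

Because the rotation is orthogonal, the components stay uncorrelated and each input's communality (the row sum of squared loadings) is unchanged; only the way variance is distributed across components shifts, which is what tends to make the loadings easier to read.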

The PCA algorithm analyzed the inputs' intercorrelations and extracted 13

components which accounted for almost 85% of the variance/information of the

original fields - a large step toward simplicity with a minimum loss of information.

The amount of information retained by the extracted solution is summarized in

Table 6.15.
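
A table of this kind can be derived from the eigenvalues of the inputs' correlation matrix. The sketch below shows one way to compute the three columns; it uses synthetic data and a hypothetical helper name `variance_table`, so the printed figures bear no relation to the values reported in Table 6.15.

```python
import numpy as np

def variance_table(X):
    """Eigenvalues, percentage of variance and cumulative percentage per component."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]  # largest first
    pct = 100 * eigvals / eigvals.sum()
    return eigvals, pct, np.cumsum(pct)

# Synthetic inputs: 8 correlated variables built by mixing 3 hidden factors plus noise.
rng = np.random.default_rng(1)
factors = rng.normal(size=(400, 3))
mixing = rng.normal(size=(3, 8))
X = factors @ mixing + 0.5 * rng.normal(size=(400, 8))

eigvals, pct, cum_pct = variance_table(X)
for i, (ev, p, c) in enumerate(zip(eigvals, pct, cum_pct), start=1):
    print(f"component {i}: eigenvalue={ev:.3f}  % variance={p:.1f}  cumulative %={c:.1f}")
```

For standardized inputs the eigenvalues sum to the number of fields, so each percentage is simply the eigenvalue divided by the field count.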

This table lists the eigenvalues and the percentage of variance (plain and

cumulative) explained by each extracted component. The criterion used to determine

the number of components to extract was the eigenvalue (or latent root)
