An Overview of Data Mining Techniques - Data Mining Techniques in CRM: Inside Customer Segmentation

revealed by analyzing the observed input data patterns. Clustering techniques

assess the similarity of the records or customers with respect to the clustering fields

and assign them to the revealed clusters accordingly. The goal is to detect groups

with internal homogeneity and interclass heterogeneity.
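
The idea of internal homogeneity and interclass heterogeneity can be made concrete with a distance measure. The following is a minimal sketch, assuming Euclidean distance as the similarity measure (the text does not prescribe one) and two small groups of made-up usage values; the names `clusters`, `mean_within_distance`, and `mean_between_distance` are hypothetical and serve only this illustration.

```python
import math
from itertools import combinations

# Hypothetical records: (voice minutes, SMS count). Values are illustrative only.
clusters = {
    "A": [(900.0, 20.0), (850.0, 30.0)],
    "B": [(300.0, 100.0), (320.0, 120.0)],
}


def euclidean(p, q):
    """Straight-line distance between two records; smaller means more similar."""
    return math.dist(p, q)


def mean_within_distance(groups):
    """Average distance between pairs of records that share a cluster."""
    dists = [
        euclidean(p, q)
        for members in groups.values()
        for p, q in combinations(members, 2)
    ]
    return sum(dists) / len(dists)


def mean_between_distance(groups):
    """Average distance between pairs of records from different clusters."""
    dists = [
        euclidean(p, q)
        for (_, a), (_, b) in combinations(groups.items(), 2)
        for p in a
        for q in b
    ]
    return sum(dists) / len(dists)


within = mean_within_distance(clusters)
between = mean_between_distance(clusters)
# A good solution has a small within-cluster and a large between-cluster distance.
print(f"within={within:.1f}, between={between:.1f}")
```

In this sketch the raw units are used as they are, so the field with the larger numeric range dominates the distance. In practice the clustering fields are usually standardized first so that each contributes comparably.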

Clustering techniques are quite popular and their use is widespread in data

mining and market research. They can support the development of different

segmentation schemes according to the clustering attributes used: namely, behavioral,

attitudinal, or demographic segmentation.

The major advantage of the clustering techniques is that they can efficiently

manage a large number of attributes and create data-driven segments. The created

segments are not based on a priori personal concepts, intuitions, and perceptions of

the business people. They are induced by the observed data patterns and, provided

they are built properly, they can lead to results with real business meaning and

value. Clustering models can analyze complex input data patterns and suggest

solutions that would not otherwise be apparent. They reveal customer typologies,

enabling tailored marketing strategies. In later chapters we will have the chance to

present real-world applications from major industries such as telecommunications

and banking, which will highlight the true benefits of data mining-derived clustering

solutions.

Unlike classification modeling, in clustering there is no predefined set of

classes. There are no predefined categories such as churners/non-churners or

buyers/non-buyers and there is also no historical dataset with pre-classified records.

It is up to the algorithm to uncover and define the classes and assign each record

to its "nearest" or, in other words, its most similar cluster. To present the basic

concepts of clustering, let us consider the hypothetical case of a mobile telephony

network operator that wants to segment its customers according to their voice and

SMS usage. The available demographic data are not used as clustering inputs in

this case since the objective concerns the grouping of customers according only to

behavioral criteria.
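
How an algorithm can uncover the classes on its own and assign each record to its nearest cluster can be sketched with k-means. This is only one common choice, used here for illustration; the text does not name a specific algorithm at this point. The records below are hypothetical voice and SMS values, not the data of Table 2.6, and the function names and the hand-picked starting centers are assumptions made for the example.

```python
import math

# Hypothetical (voice minutes, SMS count) records; not the values from Table 2.6.
records = [
    (900.0, 20.0), (860.0, 35.0),
    (300.0, 110.0), (330.0, 90.0),
    (60.0, 400.0), (80.0, 380.0),
]


def nearest(point, centers):
    """Index of the center closest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def kmeans(points, centers, max_iter=100):
    """Plain k-means: alternate assignment and center update until stable."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each record joins its most similar (nearest) cluster.
        labels = [nearest(p, centers) for p in points]
        # Update step: each center moves to the mean of its assigned records.
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(old)  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Starting centers are picked by hand here (an assumption for reproducibility);
# real tools usually choose them randomly or with a dedicated heuristic.
labels, centers = kmeans(records, centers=[records[0], records[2], records[4]])
print(labels)
print(centers)
```

No class labels are supplied anywhere in the sketch. The groups emerge from the usage values alone, and it is left to the analyst to inspect the resulting centers and give each cluster a business meaning.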

The input dataset, for a few imaginary customers, is presented in Table 2.6.

In the scatterplot in Figure 2.10, these customers are positioned in a two-dimensional

space according to their voice usage, along the X-axis, and their SMS

usage, along the Y-axis.

The clustering procedure is depicted in Figure 2.11, where voice and SMS

usage intensity are represented by the corresponding symbols.

Examination of the scatterplot reveals specific similarities among the customers.

Customers 1 and 6 appear close together and present heavy voice usage

and low SMS usage. They can be placed in a single group which we label as "Heavy

voice users." Similarly, customers 2 and 3 also appear close together but far apart

from the rest. They form a group of their own, characterized by average voice and

SMS usage. Therefore one more cluster has been disclosed, which can be labeled

as "Typical users." Finally, customers 4 and 5 also seem to be different from the
