subgroups with respect to the target field, for instance subgroups containing only
churners or only members of a particular cluster.
They start with an initial or root node which contains the entire dataset. At
each level, the effect of all predictors on the target field is evaluated and the
population is split with the predictor that yields the best separation in regard to
the target field. The process is applied recursively and the derived groups are split
further into smaller and "purer" subgroups until the tree grows fully into a set of
"terminal" nodes. In general, the tree-growing procedure stops when additional
splits can no longer improve the separation or when one of the user-specified
stopping criteria has been met.
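The recursive growing procedure described above can be sketched in Python. This is an illustrative toy implementation, not the book's specific algorithm: it evaluates every predictor/threshold split with Gini impurity (one common separation measure; others, such as chi-square or entropy, are also used), keeps the best split, and recurses until a node is pure or a minimum-size stopping criterion is met.

```python
def gini(labels):
    """Gini impurity of a list of binary class labels (0/1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # proportion of class 1
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def best_split(rows, labels):
    """Return the (feature_index, threshold) pair that most improves purity,
    or None if no split beats the current node's impurity."""
    best, best_impurity = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # degenerate split, skip
            w = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if w < best_impurity:
                best_impurity, best = w, (f, t)
    return best

def grow(rows, labels, min_size=2):
    """Recursively grow a tree; a terminal node stores the class-1 proportion,
    an internal node is (feature, threshold, left_subtree, right_subtree)."""
    split = best_split(rows, labels)
    if split is None or len(labels) < min_size:
        return sum(labels) / len(labels)  # terminal node
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow([r for r, _ in left], [y for _, y in left], min_size),
            grow([r for r, _ in right], [y for _, y in right], min_size))
```

On a toy dataset with one predictor, `grow([[1], [2], [3], [4]], [0, 0, 1, 1])` splits at the threshold 2 and yields two pure terminal nodes.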
Decision trees, when used in classification modeling, generate a rule set
containing rules in the following format:
IF (PREDICTOR VALUES) THEN (PREDICTION=TARGET OUTCOME with a specific
CONFIDENCE or PROPENSITY SCORE).
For example:
IF (Tenure <= 4 yrs and Number of different products <= 2 and Recency of
last Transaction > 20 days) THEN (PREDICTION=Churner and CONFIDENCE=0.78).
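The example rule can be expressed directly as a predicate. The field names and the 0.78 confidence below simply restate the example; any customer matching all three conditions receives the rule's prediction and confidence.

```python
def churn_rule(tenure_yrs, n_products, recency_days):
    """The example rule: IF tenure <= 4 AND products <= 2 AND recency > 20
    THEN predict 'Churner' with confidence 0.78; otherwise the rule
    does not fire and None is returned."""
    if tenure_yrs <= 4 and n_products <= 2 and recency_days > 20:
        return ("Churner", 0.78)
    return None
```

In a full tree, the rules of all terminal nodes are mutually exclusive and collectively exhaustive, so exactly one rule fires for each record.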
When the tree is fully grown, each record lands in one of the terminal nodes.
Each terminal node corresponds to a distinct rule that associates the predictors
with the outcome. All records landing in the same terminal node receive the same
prediction and the same prediction confidence (the estimated likelihood of the
prediction).
The path of successive splits from the root node to each terminal node indicates
the rule conditions; in other words, the combination of predictor characteristics
associated with a specific outcome. In general, if misclassification costs have not
been defined, the modal (most frequent) outcome category of the terminal
node denotes the prediction of the respective rule. The proportion of the modal
category in each terminal node designates the confidence of the prediction. Thus,
if 78% of records in a terminal node fall within the category of non-churners,
then the respective rule's prediction would be "no churn" with a confidence of
78% or 0.78. Since the target field in this example is binary and simply separates
churners from non-churners, the rule's churn propensity would be (1-0.78), thus
0.22 or 22%. When the model scores new cases, all customers that satisfy the rule
conditions, that is, all customers with the terminal node's profile, will be assigned
a churn propensity of 22%.
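The confidence and propensity arithmetic above reduces to a few lines. This sketch assumes a binary target coded 1 for churners and 0 for non-churners, as in the worked example:

```python
def node_scores(labels):
    """For a terminal node's binary labels (1 = churner, 0 = non-churner),
    return (prediction, confidence, churn_propensity):
    confidence is the share of the modal class, and every record in the
    node receives the same churn propensity."""
    n = len(labels)
    churn_share = sum(labels) / n
    modal_is_churn = churn_share >= 0.5
    prediction = "churn" if modal_is_churn else "no churn"
    confidence = churn_share if modal_is_churn else 1.0 - churn_share
    propensity = churn_share  # P(churn) assigned to all records in the node
    return prediction, confidence, propensity
```

A terminal node with 78 non-churners and 22 churners reproduces the example: prediction "no churn", confidence 0.78, churn propensity 0.22.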