Consider, for instance, the effect of marital status on churn. The marital status field includes four categories: single, married, divorced, and widowed. The algorithm, based on its splitting criteria, concludes that only single customers behave differently with respect to churn. It therefore regroups the marital status categories, and if this field is selected for splitting, it produces a binary split that separates single customers from the rest. This regrouping simplifies the generated model and allows analysts to focus on the groups that genuinely differ with respect to the output.
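The sketch below illustrates one way such regrouping can be approximated in Python; it is not the exact procedure of any particular tree algorithm. The DataFrame df and the column names marital_status and churn are hypothetical. The function greedily merges the pair of category groups whose target distributions are most alike (highest chi-square p-value) and stops once every remaining pair differs significantly.

```python
import pandas as pd
from scipy.stats import chi2_contingency

def regroup_categories(df, predictor, target, alpha=0.05):
    """Greedily merge the two most similar category groups (highest
    chi-square p-value against the target) until all remaining groups
    differ significantly at level alpha, or only two groups are left."""
    groups = {cat: [cat] for cat in df[predictor].dropna().unique()}
    while len(groups) > 2:
        keys = list(groups)
        best_pair, best_p = None, -1.0
        for i in range(len(keys)):
            for j in range(i + 1, len(keys)):
                a, b = keys[i], keys[j]
                subset = df[df[predictor].isin(groups[a] + groups[b])]
                table = pd.crosstab(subset[predictor].isin(groups[a]),
                                    subset[target])
                _, p, _, _ = chi2_contingency(table)
                if p > best_p:
                    best_pair, best_p = (a, b), p
        if best_p <= alpha:          # every remaining pair already differs
            break
        a, b = best_pair             # merge the most similar pair of groups
        groups[a] = groups[a] + groups.pop(b)
    return list(groups.values())

# For the marital status example this might return, e.g.,
# [['single'], ['married', 'divorced', 'widowed']]
```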
Decision tree models discretize continuous predictors by collapsing their values into ordered categories before evaluating them for possible splitting. The respective fields are thus transformed into ordinal categorical fields, that is, fields with ordered categories. As an example, let us review the handling of the continuous field representing the number of SMS messages in the telecommunications cross-selling exercise presented above. A threshold of 84 SMS messages was identified, and the respective split partitioned customers into two groups: those with more than 84 SMS messages per month and those with fewer.
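As a rough illustration (not the exact procedure used above), a depth-one decision tree can be fitted on a single continuous field to expose the threshold it chooses for the binary split. The DataFrame df and the columns sms_per_month and bought_addon are hypothetical stand-ins for the cross-selling data.

```python
from sklearn.tree import DecisionTreeClassifier

# Fit a single-split ("stump") tree on the hypothetical SMS usage field.
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
stump.fit(df[["sms_per_month"]], df["bought_addon"])

# The root node's threshold is the cut point of the binary split; records
# with sms_per_month <= threshold go to the left branch.
print("split threshold:", stump.tree_.threshold[0])
```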
Developing Stable and Understandable Decision Tree Models
In the case of decision tree models, ''less is more'': the simplicity of the generated rules is a factor to consider alongside predictive ability. The number of tree levels should rarely be set above five or six. In cases where decision trees are mainly applied for profiling and for explaining a particular outcome, this setting should be kept even lower (for instance, by requesting three levels) in order to provide a concise and readable rule set that illuminates the associations between the target and the inputs.
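In a library such as scikit-learn, this restriction corresponds to the max_depth parameter. The feature matrix X and target y below are hypothetical; the sketch simply contrasts a shallow profiling tree with a somewhat deeper scoring tree and prints the rules of the former.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Three levels for a concise, readable profiling rule set.
profiling_tree = DecisionTreeClassifier(max_depth=3, random_state=0)
# Five to six levels is usually a sensible upper bound for scoring models.
scoring_tree = DecisionTreeClassifier(max_depth=6, random_state=0)

profiling_tree.fit(X, y)
scoring_tree.fit(X, y)

# Print the generated rules of the shallow tree in plain text.
print(export_text(profiling_tree, feature_names=list(X.columns)))
```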
A crucial aspect to consider in the development of decision tree models
is their stability and their ability to capture general patterns, and not patterns
pertaining only to the particular training dataset. In general, the impurity
decreases as the tree size increases. Decision trees, if not restricted with
appropriate settings, may continue to grow until they reach perfect separation,
even if they end up with terminal nodes containing only a handful of records.
But will a rule founded on the behavior of two or three records work well on
new data? Most likely not, so it is crucial that data miners also take into account the support of each rule, that is, the rule's coverage or ''how many cases constitute the rule.''
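One common way to enforce a minimum support, illustrated below with scikit-learn (the X and y objects are again hypothetical), is to set a minimum number of records per terminal node and then inspect how many training cases fall under each rule.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Require at least 50 records in every terminal node so that no rule
# rests on just a handful of cases.
tree = DecisionTreeClassifier(max_depth=5, min_samples_leaf=50, random_state=0)
tree.fit(X, y)

# tree.apply returns the terminal node reached by each training record;
# counting them gives the support (coverage) of every rule.
leaf_ids = tree.apply(X)
support = np.bincount(leaf_ids, minlength=tree.tree_.node_count)
for node, count in enumerate(support):
    if tree.tree_.children_left[node] == -1 and count > 0:  # leaves only
        print(f"rule ending at node {node}: support = {count} records")
```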
Maximum purity and the correspondingly high confidence scores are not the only things that data miners should consider. They should also avoid models that achieve near-perfect separation on the training data yet fail to generalize to new records.