subgroups with respect to the target field, for instance subgroups containing only
churners or only members of a particular cluster.
They start with an initial or root node which contains the entire dataset. At
each level, the effect of all predictors on the target field is evaluated and the
population is split with the predictor that yields the best separation in regard to
the target field. The process is applied recursively and the derived groups are split
further into smaller and "purer" subgroups until the tree grows fully into a set of
"terminal" nodes. In general, the tree-growing procedure stops when additional
splits can no longer improve the separation or when one of the user-specified
stopping criteria has been met.
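The recursive growing procedure described above can be sketched in Python. This is an illustrative toy implementation, not the book's specific algorithm: it evaluates every predictor/threshold split with Gini impurity (one common separation measure; others, such as chi-square or entropy, are also used), keeps the best split, and recurses until a node is pure or a minimum-size stopping criterion is met.

```python
def gini(labels):
    """Gini impurity of a list of binary class labels (0/1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # proportion of class 1
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def best_split(rows, labels):
    """Return the (feature_index, threshold) pair that most improves purity,
    or None if no split beats the current node's impurity."""
    best, best_impurity = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # degenerate split, skip
            w = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if w < best_impurity:
                best_impurity, best = w, (f, t)
    return best

def grow(rows, labels, min_size=2):
    """Recursively grow a tree; a terminal node stores the class-1 proportion,
    an internal node is (feature, threshold, left_subtree, right_subtree)."""
    split = best_split(rows, labels)
    if split is None or len(labels) < min_size:
        return sum(labels) / len(labels)  # terminal node
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow([r for r, _ in left], [y for _, y in left], min_size),
            grow([r for r, _ in right], [y for _, y in right], min_size))
```

On a toy dataset with one predictor, `grow([[1], [2], [3], [4]], [0, 0, 1, 1])` splits at the threshold 2 and yields two pure terminal nodes.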
Decision trees, when used in classification modeling, generate a rule set
containing rules in the following format:
IF (PREDICTOR VALUES) THEN (PREDICTION=TARGET OUTCOME with a specific
CONFIDENCE or PROPENSITY SCORE).
For example:
IF (Tenure <= 4 yrs and Number of different products <= 2 and Recency of
last Transaction > 20 days) THEN (PREDICTION=Churner and CONFIDENCE=0.78).
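The example rule can be expressed directly as a predicate. The field names and the 0.78 confidence below simply restate the example; any customer matching all three conditions receives the rule's prediction and confidence.

```python
def churn_rule(tenure_yrs, n_products, recency_days):
    """The example rule: IF tenure <= 4 AND products <= 2 AND recency > 20
    THEN predict 'Churner' with confidence 0.78; otherwise the rule
    does not fire and None is returned."""
    if tenure_yrs <= 4 and n_products <= 2 and recency_days > 20:
        return ("Churner", 0.78)
    return None
```

In a full tree, the rules of all terminal nodes are mutually exclusive and collectively exhaustive, so exactly one rule fires for each record.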
When the tree is fully grown, each record lands in one of the terminal nodes.
Each terminal node corresponds to a distinct rule that associates the predictors
with the outcome. All records landing in the same terminal node receive the same
prediction and the same prediction confidence (the estimated likelihood of the
prediction).
The path of successive splits from the root node to each terminal node indicates
the rule conditions; in other words, the combination of predictor characteristics
associated with a specific outcome. In general, if misclassification costs have not
been defined, the modal (most frequent) outcome category of the terminal
node denotes the prediction of the respective rule. The proportion of the modal
category in each terminal node designates the confidence of the prediction. Thus,
if 78% of records in a terminal node fall within the category of non-churners,
then the respective rule's prediction would be "no churn" with a confidence of
78% or 0.78. Since the target field in this example is binary and simply separates
churners from non-churners, the rule's churn propensity would be (1-0.78), thus
0.22 or 22%. When the model scores new cases, all customers that satisfy the rule
conditions, that is, all customers with the terminal node's profile, will be assigned
a churn propensity of 22%.
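The confidence and propensity arithmetic above reduces to a few lines. This sketch assumes a binary target coded 1 for churners and 0 for non-churners, as in the worked example:

```python
def node_scores(labels):
    """For a terminal node's binary labels (1 = churner, 0 = non-churner),
    return (prediction, confidence, churn_propensity):
    confidence is the share of the modal class, and every record in the
    node receives the same churn propensity."""
    n = len(labels)
    churn_share = sum(labels) / n
    modal_is_churn = churn_share >= 0.5
    prediction = "churn" if modal_is_churn else "no churn"
    confidence = churn_share if modal_is_churn else 1.0 - churn_share
    propensity = churn_share  # P(churn) assigned to all records in the node
    return prediction, confidence, propensity
```

A terminal node with 78 non-churners and 22 churners reproduces the example: prediction "no churn", confidence 0.78, churn propensity 0.22.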