Database Reference
In-Depth Information
Decision rules: These are quite similar to decision trees and produce a list of
rules which have the format of human-understandable statements:
IF (PREDICTOR VALUES) THEN (TARGET OUTCOME AND CONFIDENCE SCORE).
Their main difference from decision trees is that they may produce multiple
rules for each record. Decision trees generate exhaustive and mutually exclusive
rules which cover all records. For each record only one rule applies. On the
contrary, decision rules may generate an overlapping set of rules. More than
one rule, with different predictions, may hold true for each record. In that case,
rules are evaluated, through an integrated procedure, to determine the one for
scoring. Usually a voting procedure is applied, which combines the individual
rules and averages their confidences for each output category. Finally, the
category with the highest average confidence is selected as the prediction.
Decision rule algorithms include:
-C5.0
- Decision list.
Logistic regression: This is a powerful and well-established statistical
technique that estimates the probabilities of the target categories. It is
analogous to simple linear regression but for categorical outcomes. It uses the
generalized linear model and calculates regression coefficients that represent
the effect of predictors on the probabilities of the categories of the target field.
Logistic regression results are in the form of continuous functions that estimate
the probability of membership in each target outcome:
ln( p
/
(1
p ))
=
b 0
+
b 1
·
Predictor 1
+
b 2
·
Predictor 2
+ ···+
b n
·
Predictor N
where p
probability of an event to happen.
For example:
=
ln(churn probability
/
(no churn probability))
=
b 0
+
b 1
·
Tenure
+
b 2
·
Number of products
+··· .
In order to yield optimal results it may require special data preparation,
including potential screening and transformation of the predictors. It still
demands some statistical experience, but provided it is built properly it can
produce stable and understandable results.
Neural networks: Neural networks are powerful machine learning algorithms
that use complex, nonlinear mapping functions for estimation and classification.
Search WWH ::




Custom Search