Java Reference
In-Depth Information
highest probability. The first row of Figure 8-8(a) depicts this case.
However, if we use the cost matrix in Figure 8-8(b), the cost of
predicting Non-attriter is the cost the business incurs when the actual
value is Attriter, weighted by the probability of Attriter, and vice
versa. If this cost matrix is applied to the same customer case, the
cost of predicting Attriter is 0.7 × $50 = $35 and the cost of predicting
Non-attriter is 0.3 × $150 = $45, as shown in the second row of
Figure 8-8(a). Since we choose the lowest-cost prediction, the model
now predicts this same customer as an Attriter ($35 < $45).
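The cost-based choice described above can be sketched in a few lines of Java. This is a minimal illustration, not JDM API code: the class name CostBasedPrediction and all of its methods are hypothetical, and the only data used are the probabilities and cost matrix of Figure 8-8. With those inputs the expected cost of predicting Attriter works out to 0.7 × $50 = $35, against 0.3 × $150 = $45 for Non-attriter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; these names are hypothetical and not part of the JDM API.
class CostBasedPrediction {

    // Class probabilities for the customer case in Figure 8-8(a).
    static Map<String, Double> exampleProbabilities() {
        Map<String, Double> p = new LinkedHashMap<>();
        p.put("Attriter", 0.30);
        p.put("Non-attriter", 0.70);
        return p;
    }

    // Cost matrix of Figure 8-8(b), indexed as cost.get(actual).get(predicted).
    static Map<String, Map<String, Double>> exampleCostMatrix() {
        Map<String, Double> actualAttriter = new LinkedHashMap<>();
        actualAttriter.put("Attriter", 0.0);          // true positive
        actualAttriter.put("Non-attriter", 150.0);    // false negative
        Map<String, Double> actualNonAttriter = new LinkedHashMap<>();
        actualNonAttriter.put("Attriter", 50.0);      // false positive
        actualNonAttriter.put("Non-attriter", 0.0);   // true negative
        Map<String, Map<String, Double>> cost = new LinkedHashMap<>();
        cost.put("Attriter", actualAttriter);
        cost.put("Non-attriter", actualNonAttriter);
        return cost;
    }

    // Expected cost of a prediction: sum over actual classes of
    // P(actual) * cost(actual, predicted).
    static double expectedCost(Map<String, Double> probabilities,
                               Map<String, Map<String, Double>> cost,
                               String predicted) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : probabilities.entrySet()) {
            total += e.getValue() * cost.get(e.getKey()).get(predicted);
        }
        return total;
    }

    // Default behavior: pick the class with the highest probability.
    static String highestProbability(Map<String, Double> probabilities) {
        String best = null;
        double bestProbability = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : probabilities.entrySet()) {
            if (e.getValue() > bestProbability) {
                bestProbability = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    // Cost-based behavior: pick the class with the lowest expected cost.
    static String lowestCost(Map<String, Double> probabilities,
                             Map<String, Map<String, Double>> cost) {
        String best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (String candidate : probabilities.keySet()) {
            double c = expectedCost(probabilities, cost, candidate);
            if (c < bestCost) {
                bestCost = c;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> p = exampleProbabilities();
        Map<String, Map<String, Double>> cost = exampleCostMatrix();
        System.out.println("By probability: " + highestProbability(p));
        System.out.println("By cost:        " + lowestCost(p, cost));
    }
}
```

The same customer is scored twice with identical probabilities; only the decision rule changes, which is why the top prediction flips from Non-attriter to Attriter once the cost matrix is taken into account.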
Figure 8-9 shows the apply content options for each function that
supports apply, using the JDM enumerations ClassificationApplyContent,
RegressionApplyContent, and ClusteringApplyContent.
For classification models, JDM defines four possible contents:
predicted category, probability, cost, and node ID. The predicted category
content places the predicted target value in the apply output; similarly,
the probability and cost contents place the probability or cost
corresponding to the predicted target value in the output. The node ID
content is specific to rules-based models, such as decision trees, that
use a specific tree node or rule to make a prediction. When the node ID
content is specified, the ID of the node that produced the prediction is
provided in the apply result. The node ID is useful for showing why a
given prediction was made.
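As a rough sketch of how the requested contents shape an apply result, the following Java example scores one case with a toy single-split decision tree and returns only the columns that were asked for. Everything here is an assumption made for illustration: the Content enum is a local stand-in rather than the JDM ClassificationApplyContent enumeration, and the tree, its split on tenure, its node IDs, and its probability and cost values are invented.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; it does not use or reproduce the JDM API.
class ApplyContentSketch {

    // Hypothetical stand-in for the four classification apply contents.
    enum Content { PREDICTED_CATEGORY, PROBABILITY, COST, NODE_ID }

    // One leaf of a toy decision tree; all values are made up for illustration.
    static final class Leaf {
        final int nodeId;
        final String category;
        final double probability;
        final double cost;

        Leaf(int nodeId, String category, double probability, double cost) {
            this.nodeId = nodeId;
            this.category = category;
            this.probability = probability;
            this.cost = cost;
        }
    }

    static final Leaf SHORT_TENURE = new Leaf(2, "Attriter", 0.80, 10.0);
    static final Leaf LONG_TENURE = new Leaf(3, "Non-attriter", 0.90, 15.0);

    // Hypothetical single-split tree: customers with under 12 months of
    // tenure fall into node 2, all others into node 3.
    static Leaf findLeaf(int tenureMonths) {
        return tenureMonths < 12 ? SHORT_TENURE : LONG_TENURE;
    }

    // Builds one apply-output row containing only the requested contents.
    static Map<String, Object> apply(int tenureMonths, Set<Content> requested) {
        Leaf leaf = findLeaf(tenureMonths);
        Map<String, Object> row = new LinkedHashMap<>();
        if (requested.contains(Content.PREDICTED_CATEGORY)) {
            row.put("predictedCategory", leaf.category);
        }
        if (requested.contains(Content.PROBABILITY)) {
            row.put("probability", leaf.probability);
        }
        if (requested.contains(Content.COST)) {
            row.put("cost", leaf.cost);
        }
        if (requested.contains(Content.NODE_ID)) {
            row.put("nodeId", leaf.nodeId);
        }
        return row;
    }

    public static void main(String[] args) {
        Set<Content> requested = EnumSet.of(Content.PREDICTED_CATEGORY, Content.NODE_ID);
        System.out.println(apply(6, requested));
    }
}
```

Requesting the node ID alongside the predicted category lets a caller trace a prediction back to the tree node, and hence the rule, that produced it.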
(a) Computation of costs

Basis         Attriter            Non-Attriter         Top Prediction
Probability   0.30                0.70                 Non-Attriter
Cost          0.7 × $50 = $35     0.3 × $150 = $45     Attriter

(b) Cost matrix

                      Predicted Attriter    Predicted Non-Attriter
Actual Attriter       0 (TP)                $150 (FN)
Actual Non-Attriter   $50 (FP)              0 (TN)

Figure 8-8
Prediction Costs. (a) Computation of costs based on the (b) specified
cost matrix.
javax.datamining
Enum
ClassificationApplyContent: predictedCategory, probability, cost, nodeId
RegressionApplyContent: predictedValue, confidence
ClusteringApplyContent: clusterIdentifier, probability, qualityOfFit, distance
Figure 8-9
Apply contents.
 