highest probability. The first row of Figure 8-8(a) depicts this case.

However, if we use the cost matrix in Figure 8-8(b), the cost of
predicting Non-attriter is computed from the cost the business incurs
when the actual value is Attriter, weighted by the probability of that
outcome, and vice versa. If this cost matrix is applied to the same
customer case, the cost of predicting Attriter is $30 and the cost of
predicting Non-attriter is $45, as shown in the second row of
Figure 8-8(a). Because the lowest-cost prediction is chosen, the model
predicts this same customer as an Attriter ($30 < $45).
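The cost-based choice described above can be sketched in a few lines of Java. This is a minimal illustration, not JDM code: the class CostSensitivePrediction and its methods are hypothetical names, and the only inputs are the probabilities and cost matrix printed in Figure 8-8. With those printed inputs the product 0.7 × $50 evaluates to $35 rather than the $30 shown in the figure, so the sketch reports $35 for Attriter; the resulting choice of Attriter over Non-attriter ($45) is the same.

```java
import java.util.Map;

// Illustrative sketch only: hypothetical class, not part of the JDM API.
final class CostSensitivePrediction {

    // Expected cost of predicting `predicted`:
    // sum over actual classes of P(actual) * cost[actual][predicted].
    static double expectedCost(String predicted,
                               Map<String, Double> probabilities,
                               Map<String, Map<String, Double>> costMatrix) {
        double total = 0.0;
        for (Map.Entry<String, Double> actual : probabilities.entrySet()) {
            double cost = costMatrix.get(actual.getKey()).get(predicted);
            total += actual.getValue() * cost;
        }
        return total;
    }

    // Returns the class whose expected cost is lowest.
    static String lowestCostPrediction(Map<String, Double> probabilities,
                                       Map<String, Map<String, Double>> costMatrix) {
        String best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (String candidate : probabilities.keySet()) {
            double cost = expectedCost(candidate, probabilities, costMatrix);
            if (cost < bestCost) {
                bestCost = cost;
                best = candidate;
            }
        }
        return best;
    }

    // Probabilities from the first row of Figure 8-8(a).
    static Map<String, Double> figureProbabilities() {
        return Map.of("Attriter", 0.30, "Non-attriter", 0.70);
    }

    // Cost matrix from Figure 8-8(b), keyed as actual -> predicted -> cost.
    static Map<String, Map<String, Double>> figureCostMatrix() {
        return Map.of(
            "Attriter", Map.of("Attriter", 0.0, "Non-attriter", 150.0),
            "Non-attriter", Map.of("Attriter", 50.0, "Non-attriter", 0.0));
    }

    public static void main(String[] args) {
        Map<String, Double> p = figureProbabilities();
        Map<String, Map<String, Double>> c = figureCostMatrix();
        System.out.println("Cost of predicting Attriter: " + expectedCost("Attriter", p, c));
        System.out.println("Cost of predicting Non-attriter: " + expectedCost("Non-attriter", p, c));
        System.out.println("Lowest-cost prediction: " + lowestCostPrediction(p, c));
    }
}
```

Keying the matrix by actual value first and predicted value second matches the layout of Figure 8-8(b), so each false-positive or false-negative cost can be read straight from the figure.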

Figure 8-9 shows the apply content options for each function that
supports apply, using the JDM enumerations ClassificationApplyContent,
RegressionApplyContent, and ClusteringApplyContent.

For classification models, JDM defines four possible contents:
predicted category, probability, cost, and node id. The predicted
category content results in the predicted target value in the apply
output; similarly, the probability and cost contents result in the
probability or cost corresponding to the predicted target value. The
node id content is specific to rules-based models, such as decision
trees, that use a specific tree node or rule for making a prediction.
When node id content is specified, the id of the node that produced the
prediction is provided in the apply result. The node id is useful for
showing why a given prediction was made.
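To make the effect of the four classification contents concrete, the following sketch builds one apply-output row that contains only the requested contents. It is a self-contained illustration under stated assumptions: the nested enum Content is a hypothetical stand-in that mirrors the names in Figure 8-9 and is not the JDM ClassificationApplyContent class, buildRow is an invented helper, and the node id "N5" is a made-up value. The category, probability, and cost values are those of the Non-attriter prediction in Figure 8-8.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: hypothetical types, not part of the JDM API.
final class ApplyOutputSketch {

    // Stand-in for the four classification apply contents named in the text.
    enum Content { PREDICTED_CATEGORY, PROBABILITY, COST, NODE_ID }

    // Builds one apply-output row holding only the requested contents.
    static Map<Content, Object> buildRow(Set<Content> requested,
                                         String category,
                                         double probability,
                                         double cost,
                                         String nodeId) {
        Map<Content, Object> row = new LinkedHashMap<>();
        if (requested.contains(Content.PREDICTED_CATEGORY)) {
            row.put(Content.PREDICTED_CATEGORY, category);
        }
        if (requested.contains(Content.PROBABILITY)) {
            row.put(Content.PROBABILITY, probability);
        }
        if (requested.contains(Content.COST)) {
            row.put(Content.COST, cost);
        }
        if (requested.contains(Content.NODE_ID)) {
            // Only meaningful for rules-based models such as decision trees.
            row.put(Content.NODE_ID, nodeId);
        }
        return row;
    }

    public static void main(String[] args) {
        // "N5" is a hypothetical node id used purely for illustration.
        Map<Content, Object> row = buildRow(
            EnumSet.of(Content.PREDICTED_CATEGORY, Content.PROBABILITY, Content.NODE_ID),
            "Non-attriter", 0.70, 45.0, "N5");
        System.out.println(row);
    }
}
```

Because the cost content is not requested in this usage, the row carries the predicted category, its probability, and the node id, but no cost entry.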

(a)
              Attriter           Non-Attriter        Top Prediction
Probability   0.30               0.70                Non-Attriter
Cost          0.7 × $50 = $30    0.3 × $150 = $45    Attriter

(b)
                       Predicted Attriter    Predicted Non-Attriter
Actual Attriter        0 (TP)                $150 (FN)
Actual Non-Attriter    $50 (FP)              0 (TN)

Figure 8-8 Prediction costs. (a) Computation of costs based on (b) the
specified cost matrix.

javax.datamining enumerations

ClassificationApplyContent: predictedCategory, probability, cost, nodeId
RegressionApplyContent:     predictedValue, confidence
ClusteringApplyContent:     clusterIdentifier, probability, qualityOfFit, distance

Figure 8-9 Apply contents.

