Practical Problem Solving - Java Data Mining: Strategy, Standard, and Practice

Table 12-1  ROC object: example data contents (continued)

False Alarm  Hit Rate  True Neg.  False Neg.  True Pos.  False Pos.  Probability Threshold
0.35         0.945291  24150      639         11047      13004       0.075
0.4          0.960988  22293      455         11231      14862       0.071
0.45         0.973021  20435      315         11371      16719       0.039
0.5          0.982192  18577      208         11478      18577       0.034
0.55         0.988682  16719      132         11554      20435       0.012
0.6          0.992834  14862      83          11603      22293       0.007
0.65         0.995267  13004      55          11631      24150       0.006
0.7          0.997004  11146      35          11651      26008       0.004
0.75         0.998207  9288       20          11666      27866       0.003
0.8          0.998736  7431       14          11672      29724       0.003
0.85         0.999135  5573       10          11676      31581       0.002
0.9          0.999436  3715       6           11680      33439       0.002
0.95         0.9997    1857       3           11683      35297       0.001
1.0          1.0       0          0           11687      37155       0
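The two rate columns of Table 12-1 follow from the four confusion-matrix counts in the same row: the hit rate is the share of actual positives that the model selects, and the false alarm rate is the share of actual negatives that it selects. The following is a minimal sketch of that arithmetic. The class `RocPoint` and its method names are hypothetical and are not part of the JDM API. The counts in the table appear to be rounded, so the recomputed rates match the printed ones closely but not digit for digit.

```java
// Illustrative helper, NOT part of the JDM (JSR-73) API: it only recomputes
// the two rate columns of Table 12-1 from the four confusion-matrix counts.
final class RocPoint {
    final long trueNeg;
    final long falseNeg;
    final long truePos;
    final long falsePos;

    RocPoint(long trueNeg, long falseNeg, long truePos, long falsePos) {
        this.trueNeg = trueNeg;
        this.falseNeg = falseNeg;
        this.truePos = truePos;
        this.falsePos = falsePos;
    }

    // Hit rate (true positive rate): TP / (TP + FN).
    double hitRate() {
        return (double) truePos / (truePos + falseNeg);
    }

    // False alarm rate (false positive rate): FP / (FP + TN).
    double falseAlarmRate() {
        return (double) falsePos / (falsePos + trueNeg);
    }

    // Customers selected by the model at this threshold: TP + FP.
    long selected() {
        return truePos + falsePos;
    }

    public static void main(String[] args) {
        // Counts taken from the Table 12-1 row with probability threshold 0.075.
        RocPoint p = new RocPoint(24150, 639, 11047, 13004);
        System.out.printf("hit rate         = %.4f%n", p.hitRate());
        System.out.printf("false alarm rate = %.4f%n", p.falseAlarmRate());
        System.out.println("selected         = " + p.selected());
    }
}
```

Keeping the four counts as the stored fields and deriving the rates on demand mirrors how the table is laid out: the counts are the primary data, and every point on the ROC curve can be recovered from them.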

The data points contained in this table are often depicted as an ROC curve, which plots the true positive rate, or hit rate, against the false positive rate, or false alarm rate, as shown in Figure 12-2. For each point on this curve, JDM also provides all elements of the confusion matrix associated with the probability threshold. In other words, the third row of Table 12-1 tells you that, if you select all the customers with a probability higher than 0.36 (the value in the last column), the model returns 7,882 + 3,715 = 11,597 “positive” customers (the sum of the true positive and false positive cases). Remember that positive/negative means selected/not selected by the model, and true/false means correctly/incorrectly classified. In the first row of Table 12-1, the probability threshold is so high that no customers are selected. In the last row, the threshold is so low that all customers are selected. To come back to the third row: 7,882 customers were correctly classified, which means that, in our scenario, they buy the product proposed by HEW, and 3,715 were contacted but did not buy the product. The nice thing about all
