[Figure 8.19 ROC curve for the data in Figure 8.18, plotting true positive rate (TPR) on the y-axis against false positive rate (FPR) on the x-axis, each running from 0.0 to 1.0.]
remaining nine tuples, which are all classified as negative, five actually are negative (thus, TN = 5). The remaining four are all actually positive; thus, FN = 4. We can therefore compute TPR = TP/P = 1/5 = 0.2, while FPR = 0. Thus, we have the point (0.2, 0) for the ROC curve.
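To make this bookkeeping concrete, here is a minimal Python sketch of the confusion counts at this first threshold. The ranked labels and probabilities below are illustrative assumptions consistent with the walk-through (tuple 1 positive at 0.9, tuple 2 positive at 0.8, tuple 3 negative at 0.7, five actual positives in all); they are not the actual values of Figure 8.18, which is not reproduced here.

# Assumed ranked data: (actual class, classifier probability), sorted
# by decreasing probability; NOT the actual contents of Figure 8.18.
ranked = [("P", 0.90), ("P", 0.80), ("N", 0.70), ("P", 0.60), ("P", 0.55),
          ("N", 0.54), ("N", 0.53), ("N", 0.51), ("P", 0.50), ("N", 0.40)]

P = sum(1 for label, _ in ranked if label == "P")  # 5 actual positives
N = len(ranked) - P                                # 5 actual negatives

t = 0.90  # first threshold: only tuple 1 is classified positive
TP = sum(1 for label, prob in ranked if prob >= t and label == "P")  # 1
FP = sum(1 for label, prob in ranked if prob >= t and label == "N")  # 0
FN = P - TP  # 4 positives classified negative
TN = N - FP  # 5 negatives classified negative

TPR, FPR = TP / P, FP / N
print((TPR, FPR))  # (0.2, 0.0) -- the first ROC point

Note that the text writes each point as (TPR, FPR); when plotting, FPR goes on the horizontal axis, as in Figure 8.19.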
Next, threshold t is set to 0.8, the probability value for tuple 2, so this tuple is now also considered positive, while tuples 3 through 10 are considered negative. The actual class label of tuple 2 is positive, thus now TP = 2. The rest of the row can easily be computed, resulting in the point (0.4, 0). Next, we examine the class label of tuple 3 and let t be 0.7, the probability value returned by the classifier for that tuple. Thus, tuple 3 is considered positive, yet its actual label is negative, and so it is a false positive. TP stays the same while FP increments so that FP = 1, giving FPR = FP/N = 1/5 = 0.2. The rest of the values in the row can also be easily computed, yielding the point (0.4, 0.2). The resulting ROC graph, obtained by examining each tuple in turn, is the jagged line shown in Figure 8.19.
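Continuing the same sketch (reusing the assumed ranked, P, and N from above), the whole threshold sweep reduces to walking down the ranked list and, at each tuple, incrementing either the true positive or the false positive count:

def roc_points(ranked, P, N):
    # Sweep the threshold down the ranked list (decreasing probability),
    # emitting one (TPR, FPR) point per tuple, as in the walk-through above.
    points = []
    TP = FP = 0
    for label, _prob in ranked:
        if label == "P":
            TP += 1  # the newly positive prediction is correct
        else:
            FP += 1  # the newly positive prediction is a false positive
        points.append((TP / P, FP / N))
    return points

print(roc_points(ranked, P, N)[:3])
# [(0.2, 0.0), (0.4, 0.0), (0.4, 0.2)] -- the three points derived in the text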
There are many methods for obtaining a curve from these points, the most common of which is to use a convex hull (sketched below). The plot also shows a diagonal line along which, for every true positive of such a model, we are just as likely to encounter a false positive; this line, included for comparison, represents random guessing.
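The text does not spell out how the hull is computed; the following sketch assumes Andrew's monotone-chain algorithm, one standard way to take the upper convex hull, applied to the swept points in plotting order (FPR on the x-axis):

def roc_convex_hull(points):
    # Upper convex hull of ROC points given as (FPR, TPR) pairs,
    # via Andrew's monotone-chain algorithm.
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    pts = sorted(set(points))
    hull = []
    for p in reversed(pts):  # right to left builds the upper hull
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull[::-1]        # left to right, from (0, 0) to (1, 1)

swept = [(0.0, 0.0)] + roc_points(ranked, P, N)            # include the origin
hull = roc_convex_hull([(fpr, tpr) for tpr, fpr in swept])  # swap to plotting order
print(hull)
# with the assumed data: [(0.0, 0.0), (0.0, 0.4), (0.2, 0.8), (0.8, 1.0), (1.0, 1.0)]

Connecting these hull vertices yields a concave curve lying on or above the jagged line of Figure 8.19.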
Figure 8.20 shows the ROC curves of two classification models. The diagonal line
representing random guessing is also shown. Thus, the closer the ROC curve of a model
is to the diagonal line, the less accurate the model. If the model is really good, initially
we are more likely to encounter true positives as we move down the ranked list; thus, the curve rises steeply from the origin.
 