Measure                                        Formula
-----------------------------------------------------------------------------------
accuracy, recognition rate                     (TP + TN) / (P + N)
error rate, misclassification rate             (FP + FN) / (P + N)
sensitivity, true positive rate, recall        TP / P
specificity, true negative rate                TN / N
precision                                      TP / (TP + FP)
F, F1, F-score, harmonic mean of               (2 × precision × recall) /
  precision and recall                           (precision + recall)
F_beta, where beta is a                        ((1 + beta²) × precision × recall) /
  non-negative real number                       (beta² × precision + recall)
Figure 8.13 Evaluation measures. Note that some measures are known by more than one name. TP, TN, FP, FN, P, N refer to the number of true positive, true negative, false positive, false negative, positive, and negative samples, respectively (see text).
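As a quick illustration, the formulas in Figure 8.13 can be sketched in Python from the four counts alone (the function and key names here are mine, not from the text):

```python
def evaluation_measures(TP, TN, FP, FN, beta=1.0):
    """Compute the measures of Figure 8.13 from the four outcome counts."""
    P = TP + FN          # number of positive tuples
    N = TN + FP          # number of negative tuples
    precision = TP / (TP + FP)
    recall = TP / P      # sensitivity, true positive rate
    return {
        "accuracy": (TP + TN) / (P + N),
        "error_rate": (FP + FN) / (P + N),
        "sensitivity": recall,
        "specificity": TN / N,
        "precision": precision,
        "F": 2 * precision * recall / (precision + recall),
        "F_beta": (1 + beta**2) * precision * recall
                  / (beta**2 * precision + recall),
    }
```

Note that with beta = 1, the F_beta formula reduces to the harmonic mean F (the F1 score), which is why the two entries in the table agree in that case.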
buys_computer = no. Suppose we use our classifier on a test set of labeled tuples.
P
is the
number of positive tuples and
N
is the number of negative tuples. For each tuple, we
compare the classifier's class label prediction with the tuple's known class label.
There are four additional terms we need to know that are the “building blocks” used
in computing many evaluation measures. Understanding them will make it easy to grasp
the meaning of the various measures.
True positives (TP): These refer to the positive tuples that were correctly labeled by the classifier. Let TP be the number of true positives.

True negatives (TN): These are the negative tuples that were correctly labeled by the classifier. Let TN be the number of true negatives.

False positives (FP): These are the negative tuples that were incorrectly labeled as positive (e.g., tuples of class buys_computer = no for which the classifier predicted buys_computer = yes). Let FP be the number of false positives.

False negatives (FN): These are the positive tuples that were mislabeled as negative (e.g., tuples of class buys_computer = yes for which the classifier predicted buys_computer = no). Let FN be the number of false negatives.
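Assuming each test tuple's known label and predicted label are available as two parallel lists, the four counts above might be tallied as follows (a sketch; the function name and list representation are my assumptions, not from the text):

```python
def count_outcomes(actual, predicted, positive="yes"):
    """Tally TP, TN, FP, FN by comparing each prediction with the known label."""
    TP = TN = FP = FN = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            TP += 1      # positive tuple, correctly labeled
        elif a != positive and p != positive:
            TN += 1      # negative tuple, correctly labeled
        elif a != positive and p == positive:
            FP += 1      # negative tuple mislabeled as positive
        else:
            FN += 1      # positive tuple mislabeled as negative
    return TP, TN, FP, FN
```

For example, with actual labels ["yes", "yes", "no", "no"] and predictions ["yes", "no", "yes", "no"], the tally is one of each outcome.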
These terms are summarized in the
confusion matrix
of Figure 8.14.
The confusion matrix is a useful tool for analyzing how well your classifier can
recognize tuples of different classes.
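Given the four counts, the matrix can be rendered as a quick text display. The layout below, actual classes as rows and predicted classes as columns, is an assumption on my part, since Figure 8.14 itself is not reproduced here:

```python
def format_confusion_matrix(TP, TN, FP, FN):
    """Render the outcome counts as a 2x2 confusion matrix:
    actual classes as rows, predicted classes as columns (layout assumed)."""
    header  = f"{'':>8}{'pred:yes':>10}{'pred:no':>10}"
    row_yes = f"{'yes':>8}{TP:>10}{FN:>10}"
    row_no  = f"{'no':>8}{FP:>10}{TN:>10}"
    return "\n".join([header, row_yes, row_no])
```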
TP
and
TN
tell us when the classifier is getting
things right, while
FP
and
FN
tell us when the classifier is getting things wrong (i.e.,