2.2.1 PRECISION, RECALL AND F-SCORES
Many of the mining and summarization techniques described in this topic are supervised binary
classifiers, where supervised means that the classifier requires training on labeled data and binary
means we are predicting one of two classes. For example, a classifier that discriminates subjective
from non-subjective comments or informative from non-informative sentences may be trained on
data where each sentence has been labeled as belonging to one of the two classes. In other words,
with all of these tasks we are trying to discern a positive class from a negative class. In these cases,
we can evaluate the classifier using precision, recall and F-score. Precision and recall are calculated
as follows:
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN},
\]
where TP means true positives (correctly classified as positive), FP means false positives (incorrectly
classified as positive), and FN means false negatives (incorrectly classified as negative). Note that
these two measurements share the same numerator, TP, the number of items correctly classified as
positive. To get precision, we divide the numerator by the number of items that were predicted to be
positive. To get recall, we divide the numerator by the number of items that really are positive. A
perfect classifier would have both precision and recall equal to 1, as FP and FN would be equal to 0
(i.e., no data would be incorrectly classified as positive or negative).
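As a concrete illustration (our own sketch, not code from the text), the following Python function computes precision and recall from parallel lists of gold labels and predicted labels; the function and variable names here are hypothetical:

```python
def precision_recall(gold, predicted):
    """Compute precision and recall for the positive class (label 1).

    gold and predicted are parallel sequences of 0/1 labels.
    """
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

# Example: 4 items really are positive, 3 are predicted positive,
# and 2 of those predictions are correct.
gold      = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1, 0]
p, r = precision_recall(gold, predicted)
print(p, r)  # precision = 2/3 ~ 0.667, recall = 2/4 = 0.5
```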
The F-score is simply a combination of precision and recall. The harmonic mean is typically
used, which is given by the following equation when precision and recall are weighted equally:
\[
F = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}.
\]
A perfect classifier would have an F-score equal to 1.
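Continuing the sketch above (again our own illustration), the balanced F-score is simply the harmonic mean of the two values just computed:

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (equally weighted F-score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# With precision = 2/3 and recall = 1/2 from the example above:
print(f_score(2/3, 1/2))  # 4/7 ~ 0.571
```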
2.2.2 ROC CURVES
Many of the mining and summarization techniques described in this topic rely on probabilistic binary
classifiers, which assign to each data instance (e.g., a sentence) a posterior probability of belonging to
a certain class, given the evidence, e.g., the sentence features used.
When calculating precision, recall and F-score for a probabilistic classifier, we evaluate the
classifier at a particular posterior probability threshold, where we consider a data instance to be
“positive”, i.e., to belong to the class, if the classifier's posterior probability for that particular instance
is greater than or equal to a threshold and “negative” otherwise. A commonly used threshold is 0.5,
the midpoint of the [0, 1] probability range.
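To make the thresholding step concrete, here is a minimal sketch (ours; the posterior values are hypothetical) that converts posterior probabilities into hard positive/negative decisions at a given threshold:

```python
def classify(posteriors, threshold=0.5):
    """Map posterior probabilities of the positive class to hard 0/1 labels.

    An instance is "positive" when its posterior is >= threshold.
    """
    return [1 if p >= threshold else 0 for p in posteriors]

posteriors = [0.9, 0.5, 0.3, 0.71]  # hypothetical classifier outputs
print(classify(posteriors))         # [1, 1, 0, 1]
```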
Arguably, a more informative alternative is to evaluate the classifier across all possible proba-
bility thresholds between 0 and 1. In practice, we can measure the true-positive/false-positive rates
as the posterior threshold is varied. The true-positive rate (TPR) and false-positive rate (FPR) are
calculated as follows:
\[
\text{TPR} = \frac{TP}{TP + FN}, \qquad
\text{FPR} = \frac{FP}{FP + TN},
\]
where TN means true negatives (correctly classified as negative).
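The following sketch (our own illustration, assuming the standard definitions of TPR and FPR above) traces the (FPR, TPR) pairs as the threshold sweeps over [0, 1]; plotted and connected, these points form the ROC curve:

```python
def roc_points(gold, posteriors, num_thresholds=101):
    """Compute (FPR, TPR) pairs as the posterior threshold varies over [0, 1].

    gold: parallel sequence of 0/1 labels;
    posteriors: positive-class probabilities from the classifier.
    """
    points = []
    for i in range(num_thresholds):
        t = i / (num_thresholds - 1)  # threshold swept from 0.0 to 1.0
        predicted = [1 if p >= t else 0 for p in posteriors]
        tp = sum(1 for g, q in zip(gold, predicted) if g == 1 and q == 1)
        fp = sum(1 for g, q in zip(gold, predicted) if g == 0 and q == 1)
        fn = sum(1 for g, q in zip(gold, predicted) if g == 1 and q == 0)
        tn = sum(1 for g, q in zip(gold, predicted) if g == 0 and q == 0)
        tpr = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0
        points.append((fpr, tpr))
    return points

gold = [1, 0, 1, 1, 0]
posteriors = [0.9, 0.4, 0.65, 0.2, 0.5]  # hypothetical outputs
print(roc_points(gold, posteriors, num_thresholds=5))
```

At a threshold of 0 every instance is labeled positive (TPR = FPR = 1), and at a threshold above the largest posterior every instance is labeled negative (TPR = FPR = 0), so the curve always spans the two corners of the unit square.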