focus, as also seen in the study by Ferri et al. [3]. Because the varying degree of
importance of the different classes is not considered, performance metrics in this
category do not fare very well in class-imbalanced situations unless the class
ratio is specifically taken into consideration. Single-class focus metrics, on the
other hand, can be more sensitive to the issue of the varying degree of importance
placed on the different classes and, as a result, be naturally better suited to
evaluation in class-imbalanced domains. The single-class focus measures that are
discussed in this section are sensitivity/specificity, precision/recall, geometric
mean (G-mean), and F-measure. In addition to single-class focus metrics, we
will discuss the multi-class focus metrics that take class ratios into consideration
as a way to mitigate the contribution of the components to the overall results.
We will also present a survey of more experimental metrics that were recently
proposed but have not yet enjoyed much exposure in the community.
All the metrics discussed in this section are based on the concept of the
confusion matrix. The confusion matrix for classifier f records the number of
examples of each class that were correctly classified as belonging to that class
by classifier f, as well as the number of examples of each class that were
misclassified. For the misclassified examples, the confusion matrix considers all
possible kinds of misclassification and records the number of examples that fall in
each category. For example, if we consider a three-class problem, the following
confusion matrix tells us that a examples of class A, e examples of class B, and
i examples of class C were correctly classified by f. However, b+c examples of
class A were wrongly classified by f, b of which were mistakenly assigned to
class B and c of which were mistakenly assigned to class C (and similarly for
classes B and C).
                 Predicted class A   Predicted class B   Predicted class C
Actual class A           a                   b                   c
Actual class B           d                   e                   f
Actual class C           g                   h                   i
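The three-class matrix described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name confusion_matrix, the label names, and the toy data are assumptions made for this example and do not come from the original text. Rows correspond to actual classes and columns to predicted classes, so the diagonal holds the correctly classified counts (a, e, and i in the notation above).

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a nested list m where m[r][c] counts the examples whose
    actual class is labels[r] and whose predicted class is labels[c]."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


labels = ["A", "B", "C"]

# Hypothetical toy data, made up purely for illustration.
actual = ["A", "A", "A", "B", "B", "C", "C", "C"]
predicted = ["A", "B", "C", "B", "A", "C", "C", "B"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"Actual class {label}: {row}")

# Diagonal entries: examples of each class that were correctly classified.
correct = [matrix[k][k] for k in range(len(labels))]

# Row total minus the diagonal: examples of each class that were misclassified
# (for class A this is b + c in the notation of the text).
wrong = [sum(row) - row[k] for k, row in enumerate(matrix)]
```

Keeping the full off-diagonal breakdown, rather than a single error count per class, is what lets the matrix show which class each misclassified example was assigned to.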
In the binary class case, the above-mentioned matrix is reduced to a 2 ×
2 format, and the issue of which class a misclassified example is assigned to
disappears, as there remains only one possibility. In such a case, specific names
are given to both the classes (positive and negative) and to the entries of the
confusion matrix (true positive, false negative, false positive, and true negative)
as shown in the following:
                  Predicted positive     Predicted negative
Actual positive   True positive (TP)     False negative (FN)
Actual negative   False positive (FP)    True negative (TN)
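For the binary case, the four entries can be counted directly from paired actual and predicted labels. The sketch below is a minimal illustration: the function name binary_confusion_counts, the label strings "pos" and "neg", and the toy data are assumptions introduced here, not part of the original text.

```python
def binary_confusion_counts(actual, predicted, positive="pos"):
    """Count TP, FN, FP, and TN, treating `positive` as the positive class
    and every other label as negative."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1  # actual positive, predicted positive
            else:
                fn += 1  # actual positive, predicted negative
        else:
            if p == positive:
                fp += 1  # actual negative, predicted positive
            else:
                tn += 1  # actual negative, predicted negative
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Hypothetical toy data, made up purely for illustration.
actual = ["pos", "pos", "neg", "neg", "neg", "neg"]
predicted = ["pos", "neg", "neg", "neg", "pos", "neg"]

counts = binary_confusion_counts(actual, predicted)
print(counts)
```

The first row of the 2 × 2 matrix is then (TP, FN) and the second row is (FP, TN), matching the layout above.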