the first part, they determine how the different metrics and their families
correlate with other metrics and families in various contexts; in the second part,
they conduct a sensitivity analysis to study the specific behavior of the metrics
and families they studied in each context. Here, we focus only on the results
they obtained in the context of the class imbalance problem.
The correlation study by Ferri et al. [3] shows that while the metrics are well
correlated within each family in the balanced situation, this is not necessarily the
case in the imbalanced situation. Indeed, two observations can be made: first,
whatever correlations exist within a given family in the balanced case, these
correlations are much weaker in the imbalanced situation; second, the crossovers
from one family to the next differ between the balanced and imbalanced cases.
Without going into the details, these observations (and the first one in particular)
allow us to conclude that the choice of metrics in the imbalanced case is of
particular importance.
To interpret the results of their sensitivity analysis, we invert the conclusions
of Ferri et al. [3], as previously mentioned, because we adopt the point of view
that sensitivity to misclassification in infrequent classes is an asset rather than
a liability. Note, however, that this inversion strategy should only be seen as a
first approximation: the study by Ferri et al. [3] plots the probability of
a wrong classification (as the class imbalance increases) no matter what class
is considered, so there is no way to differentiate between the different kinds
of errors the classifiers make. Ideally, the study should be repeated from
the perspective of researchers on the class imbalance problem. That being said,
their study indicates that the rank-based measures behave best, followed by some
instances of threshold metrics. The probabilistic metrics that do behave acceptably
well in the class imbalance case do so in the same spirit as some of the threshold
metrics discussed later (the multi-class-focused ones; see Sections 8.3.5 and 8.3.6).
In other words, it is not their probabilistic quality that makes them behave well in
imbalanced cases, and therefore, they are not discussed any further in this chapter.
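The limitation noted above, that an overall probability of misclassification
cannot distinguish errors on the rare class from errors on the frequent class,
can be illustrated with a minimal sketch. The function names and the toy labels
below are hypothetical and chosen only for illustration; they are not taken from
the study by Ferri et al. [3].

```python
from collections import Counter

def overall_error_rate(y_true, y_pred):
    """Fraction of all examples that are misclassified, regardless of class."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

def per_class_error_rate(y_true, y_pred):
    """Fraction of misclassified examples within each true class."""
    totals = Counter(y_true)
    wrong = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {c: wrong[c] / totals[c] for c in totals}

# Hypothetical imbalanced sample: 95 majority examples (class 0) and
# 5 minority examples (class 1). The assumed classifier labels every
# majority example correctly but recovers only one minority example.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 95 + [1] + [0] * 4

overall = overall_error_rate(y_true, y_pred)      # 4 errors out of 100
per_class = per_class_error_rate(y_true, y_pred)  # class 0: 0/95, class 1: 4/5

print(f"overall error rate: {overall:.2f}")
for label, rate in sorted(per_class.items()):
    print(f"error rate on class {label}: {rate:.2f}")
```

On this toy sample the overall error rate is 0.04, while the error rate on the
minority class is 0.80: an aggregate error curve alone would not reveal which
class bears the errors.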
The remainder of this chapter thus focuses only on the threshold metrics and
ranking metrics most appropriate to the class imbalance problem. The next two
sections consider each of these two families separately and discuss in detail the
particular metrics within them that are well suited to the class imbalance problem.
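To make the distinction between the two retained families concrete, the
following minimal sketch contrasts a threshold metric (accuracy at a fixed
cut-off) with a ranking metric (the area under the ROC curve, computed by
pairwise comparison of scores). The function names, the toy scores, and the
choice of AUC as the representative ranking metric are illustrative assumptions.

```python
def accuracy_at_threshold(y_true, scores, threshold=0.5):
    """Threshold metric: label an example positive when its score
    reaches the cut-off, then count the fraction of correct labels."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(1 for t, p in zip(y_true, preds) if t == p) / len(y_true)

def auc_by_pairs(y_true, scores):
    """Ranking metric: fraction of (positive, negative) pairs in which
    the positive example receives the higher score; ties count one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical scores: 8 negatives followed by 2 positives. Every positive
# is scored above every negative, but all scores lie below 0.5.
y_true = [0] * 8 + [1] * 2
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.35, 0.45]

acc_default = accuracy_at_threshold(y_true, scores)      # all predicted negative
acc_low = accuracy_at_threshold(y_true, scores, 0.33)    # cut-off between classes
auc = auc_by_pairs(y_true, scores)                       # ordering is perfect

print(acc_default, acc_low, auc)
```

With these toy scores, accuracy changes from 0.8 to 1.0 when the cut-off moves
from 0.5 to 0.33, whereas the pairwise AUC is 1.0 in both cases because it
depends only on how the examples are ordered.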
8.3 THRESHOLD METRICS: MULTIPLE- VERSUS SINGLE-CLASS FOCUS
As discussed in [1], threshold metrics can have a multiple- or a single-class focus.
The multiple-class focus metrics consider the overall performance of the learning
algorithm on all the classes in the dataset. These include accuracy, error rate,
Cohen's kappa, and Fleiss' kappa measures. Precisely because of this multi-class