ASSESSMENT METRICS FOR IMBALANCED LEARNING - Imbalanced Learning: Foundations, Algorithms, and Applications

the first part, they determine the correlation between the different metrics and

their family to other metrics and families, in various contexts; in the second part,

they conduct a sensitivity analysis to study the specific performance of the group

of metrics and their families that they studied in each context. We focus here

only on the results they obtained in the context of the class imbalance

problem.

The correlation study by Ferri et al. [3] shows that while the metrics are well

correlated within each family in the balanced situation, this is not necessarily the

case in the imbalanced situation. Indeed, two observations can be made: first,

whatever correlations exist within the same family in the balanced case, these

correlations are much weaker in the imbalanced situation, and second, the crossovers

from one family to the next are different in the balanced and imbalanced cases.

Without going into the details, these observations (and the first one in particular)

allow us to conclude that the choice of metrics in the imbalanced case is of

particular importance.
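
As a rough illustration of why the choice matters, the sketch below compares two threshold metrics, accuracy and Cohen's kappa, on a trivial classifier that always predicts the most frequent class. It is not taken from Ferri et al. [3]; the helper names (`accuracy`, `cohen_kappa`, `majority_baseline`) and the toy label lists are assumptions made for this example only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (a single label everywhere); 0.0 is a convention
        # chosen for this sketch.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


def majority_baseline(y_true):
    """Hypothetical trivial classifier: always predict the most frequent label."""
    label = Counter(y_true).most_common(1)[0][0]
    return [label] * len(y_true)


# Toy label sets (illustrative only): 50/50 versus 95/5 class distribution.
balanced = [0] * 50 + [1] * 50
imbalanced = [0] * 95 + [1] * 5

for name, y in (("balanced", balanced), ("imbalanced", imbalanced)):
    y_hat = majority_baseline(y)
    print(name, "accuracy:", accuracy(y, y_hat), "kappa:", cohen_kappa(y, y_hat))
```

On the balanced labels both metrics rate the trivial classifier as no better than chance. On the imbalanced labels accuracy rises to 0.95 while kappa stays at zero, so the two metrics no longer tell the same story even though they belong to the same family.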

To interpret their sensitivity analysis results, we invert the conclusions of Ferri

et al. [3], as previously mentioned, because we adopt the point of view that

sensitivity to misclassification in infrequent classes is an asset rather than a

liability. Note, however, that this inversion strategy should only be seen as a

first approximation because the study by Ferri et al. [3] plots the probability of

a wrong classification (as the class imbalance increases) no matter what class

is considered and there is no way to differentiate between the different kinds

of errors the classifiers make. Optimally, the study should be repeated from

the perspective of researchers on the class imbalance problem. That being said,

their study indicates that the rank-based measures behave best, followed by some

instances of threshold metrics. The probabilistic metrics that do behave acceptably

well in the class imbalance case do so in the same spirit as some of the threshold

metrics discussed later [the multi-class focused ones (Sections 8.3.5 and 8.3.6)].

In other words, it is not their probabilistic quality that makes them behave well in

imbalanced cases, and therefore, they are not discussed any further in this chapter.

The remainder of this chapter thus focuses only on the threshold metrics and

ranking metrics most appropriate to the class imbalance problem. In the next two

sections, we discuss in detail the measures that are of specific interest

for the class imbalance problem. We consider each class of metrics separately

and discuss the particular metrics within these classes that are well suited to the

class imbalance problem.
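
To make the distinction between the two families concrete, the following minimal sketch scores the same classifier output with one threshold metric (accuracy at a fixed cutoff) and one ranking metric (the area under the ROC curve, computed here as the fraction of positive-negative pairs that are ordered correctly). The function names, the toy scores, and the 0.5 and 0.33 cutoffs are assumptions chosen for illustration, not values from the chapter.

```python
def accuracy_at_threshold(y_true, scores, threshold=0.5):
    """Threshold metric: label an example positive when its score >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(t == p for t, p in zip(y_true, preds)) / len(y_true)


def auc(y_true, scores):
    """Ranking metric: probability that a random positive outscores a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy imbalanced problem: 8 negatives, 2 positives. The scores rank every
# positive above every negative, but all of them fall below 0.5.
y_true = [0] * 8 + [1] * 2
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.35, 0.40]

print("accuracy at 0.5: ", accuracy_at_threshold(y_true, scores))
print("accuracy at 0.33:", accuracy_at_threshold(y_true, scores, threshold=0.33))
print("AUC:             ", auc(y_true, scores))
```

The ranking metric depends only on the order of the scores, so it credits the perfect separation regardless of the cutoff, whereas the threshold metric changes with the cutoff and, at 0.5, counts both minority-class examples as errors.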

8.3 THRESHOLD METRICS: MULTIPLE- VERSUS SINGLE-CLASS FOCUS

As discussed in [1], threshold metrics can have a multiple- or a single-class focus.

The multiple-class focus metrics consider the overall performance of the learning

algorithm on all the classes in the dataset. These include accuracy, error rate,

Cohen's kappa, and Fleiss' kappa measures. Precisely because of this multi-class
