ASSESSMENT METRICS FOR IMBALANCED LEARNING - Imbalanced Learning: Foundations, Algorithms, and Applications

NATHALIE JAPKOWICZ

School of Electrical Engineering and Computer Science, University of Ottawa,
Ottawa, ON, Canada and Department of Computer Science, Northern Illinois
University, Illinois, USA

Abstract: Assessing learning systems is a very important aspect of the data-mining
process. As such, many different types of metrics have been proposed over the years.
In most cases, these metrics were not designed with the class imbalance problem in
mind. As a result, some of them turn out to be appropriate for this problem, while
others are not. The purpose of this chapter is to survey existing evaluation metrics
and discuss their application to class-imbalanced domains. We survey many of the
well-known metrics that were not specifically designed to handle class imbalances
as well as more recent metrics that do take this issue into consideration.

8.1 INTRODUCTION

Evaluating learning algorithms is not trivial: it requires judicious choices
of assessment metrics, error-estimation methods, and statistical tests, as well as
an understanding that the resulting evaluation can never be fully conclusive. This
is due, in part, to the inherent bias of any evaluation tool and to the frequent
violation of the assumptions it relies on [1]. In the case of class imbalances,
the problem is even more acute because the default, relatively robust procedures
used for unskewed data can break down miserably when the data is skewed.
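One way to see such a breakdown concretely is with accuracy, the fraction of test examples classified correctly. The following is a minimal sketch, not taken from the chapter: the 990-to-10 test set, the always-majority classifier, and the helper names `accuracy` and `recall` are all assumptions chosen for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of test examples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of examples of the `positive` class that were predicted as such."""
    hits = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    total = sum(t == positive for t in y_true)
    return hits / total


# Hypothetical skewed test set: 990 majority (0) and 10 minority (1) examples.
y_true = [0] * 990 + [1] * 10

# Trivial classifier (an assumption for illustration): always predict the majority class.
y_pred = [0] * len(y_true)

acc = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, positive=1)

print(f"accuracy        = {acc:.2f}")              # 990 of 1000 correct
print(f"minority recall = {minority_recall:.2f}")  # 0 of 10 minority examples found
```

Under these assumptions the classifier scores 99% accuracy while identifying none of the minority examples, so the high accuracy says little about performance on the class that is usually of interest.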

Take, for example, the case of accuracy, which measures the percentage of
times a classifier predicts the correct outcome on a test set. This simple
measure is ineffective in the presence of class imbalances as demonstrated by the
