ASSESSMENT METRICS FOR
IMBALANCED LEARNING
NATHALIE JAPKOWICZ
School of Electrical Engineering and Computer Science, University of Ottawa,
Ottawa, ON, Canada and Department of Computer Science, Northern Illinois
University, Illinois, USA
Abstract: Assessing learning systems is a very important aspect of the data-mining
process. As such, many different types of metrics have been proposed over the years.
In most cases, these metrics were not designed with the class imbalance problem in
mind. As a result, some of them turn out to be appropriate for this problem, while
others are not. The purpose of this chapter is to survey existing evaluation metrics
and discuss their application to class-imbalanced domains. We survey many of the
well-known metrics that were not specifically designed to handle class imbalances
as well as more recent metrics that do take this issue into consideration.
8.1 INTRODUCTION
Evaluating learning algorithms is not a trivial issue as it requires judicious choices
of assessment metrics, error-estimation methods, and statistical tests, as well as
an understanding that the resulting evaluation can never be fully conclusive. This
is due, in part, to the inherent bias of any evaluation tool and to the frequent
violation of the assumptions that such tools rely on [1]. In the case of class imbalances,
the problem is even more acute because the default, relatively robust procedures
used for unskewed data can break down miserably when the data is skewed.
Take, for example, accuracy, which measures the percentage of
times a classifier predicts the correct outcome in a testing set. This simple
measure is ineffective in the presence of class imbalances as demonstrated by the