Institute of Standards and Technology (NIST) Scientific and Technical Databases
[67]), there are very few data benchmarks that are solely dedicated to imbalanced
learning problems. As a result, obtaining suitable data for imbalanced learning is
very costly for the research community. For instance, many of the existing data benchmarks
require additional manipulation before they can be applied to imbalanced learn-
ing scenarios for each algorithm. This limitation has created a bottleneck in the
long-term development of research in this field. Therefore, unified data bench-
marks for imbalanced learning are important to provide an open-access source for
the community not only to promote data sharing but also to provide a common
platform to ensure a fair comparative study among different methods.
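The kind of additional manipulation mentioned above can be sketched as follows. This is only an illustrative example, not a procedure taken from any particular benchmark: the function name `make_imbalanced`, its parameters, and the fixed random seed are all assumptions introduced here. It keeps every majority example and subsamples the minority class to a chosen minority-to-majority ratio, so that different methods can be compared on exactly the same imbalanced split.

```python
import random


def make_imbalanced(samples, labels, minority_label, ratio, seed=0):
    """Subsample one class of a (roughly balanced) data set.

    Hypothetical helper: keeps all examples whose label differs from
    `minority_label` and keeps about `ratio` minority examples per
    majority example. A fixed seed makes the split reproducible.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]

    # Never keep more minority examples than exist, and keep at least one.
    n_keep = max(1, min(len(minority), round(ratio * len(majority))))

    # Sort the kept indices so the original ordering of the data is preserved.
    kept = sorted(rng.sample(minority, n_keep) + majority)
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Usage: turn a 100/100 data set into a 10/100 one.
samples = list(range(200))
labels = [1] * 100 + [0] * 100
x_imb, y_imb = make_imbalanced(samples, labels, minority_label=1, ratio=0.1)
```

Because each study tends to apply its own variant of such a step (different ratios, seeds, or sampling rules), results obtained on nominally the same benchmark are often not directly comparable, which is the motivation for shared, ready-made imbalanced benchmarks.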
1.3.3 Standardized Assessment Metrics
As discussed in Section 1.1, traditional assessment techniques may not be able
to provide a fair and comprehensive evaluation of the imbalanced learning algo-
rithms. In particular, it is widely agreed that a singular evaluation metric, such
as overall classification error rate, is not sufficient when handling imbalanced
learning problems. As suggested in [1], it seems that a combination of singular-
based metrics (e.g., precision, recall, F-measure, and G-mean) together with
curve-based assessment metrics [e.g., receiver operating characteristic (ROC)
curve, precision-recall (PR) curve, and cost curve] will provide a more complete
assessment of imbalanced learning. Therefore, it is necessary for the community
to establish—as a standard—the practice of using such assessment approaches
to provide more insights into the advantages and limitations of different types of
imbalanced learning methods. More details on this can be found in Chapter 8.
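To make the contrast with overall accuracy concrete, the following minimal sketch computes the singular metrics named above from a binary confusion matrix, along with the area under the ROC curve as one summary of a curve-based metric. It assumes the commonly used definitions (F-measure as the weighted harmonic mean of precision and recall, G-mean as the square root of sensitivity times specificity); the function names, the `beta` parameter, and the convention of returning 0.0 for undefined ratios are choices made for this illustration only.

```python
import math


def singular_metrics(y_true, y_pred, positive=1, beta=1.0):
    """Accuracy, precision, recall, F-measure, and G-mean for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Undefined ratios (zero denominators) are reported as 0.0 by convention here.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0

    denom = beta ** 2 * precision + recall
    f_measure = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    g_mean = math.sqrt(recall * specificity)
    accuracy = (tp + tn) / len(pairs)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }


def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Usage: 5 positive and 95 negative examples, and a trivial classifier
# that labels everything as negative.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
report = singular_metrics(y_true, y_pred)
# report["accuracy"] is 0.95, while recall, F-measure, and G-mean are all 0.0
```

In this example the trivial classifier looks strong by overall accuracy yet detects no minority examples at all, which is precisely the failure that the combined metrics are meant to expose.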
1.3.4 Emerging Applications with Imbalanced Learning
Imbalanced learning has proven to be an essential part of many critical
real-world applications. For instance, in the aforementioned biomedical diagnosis
situation, an effective learning approach that could handle the imbalanced data is
key to supporting the medical decision-making process. Similar scenarios have
appeared in many other mission-critical tasks, such as security (e.g., abnormal
behavior recognition), defense (e.g., military data analysis), and the financial industry
(e.g., outlier detection). This book also presents a few examples of such critical
applications to demonstrate the importance of imbalanced learning.
1.4 ACKNOWLEDGMENTS
This work was supported in part by the National Science Foundation (NSF) under
Grant ECCS 1053717 and by the Army Research Office (ARO) under Grant
W911NF-12-1-0378.