iterative step of active learning instead of querying the entire dataset. Active learning has also been integrated with sampling techniques. For instance, Zhu and Hovy [54] analyzed the effect of undersampling and oversampling techniques combined with active learning for the imbalanced word sense disambiguation (WSD) problem. Another active learning sampling method is the simple active learning heuristic (SALH) approach proposed in [55]. The main aim of this method is to provide a generic model for the evolution of genetic programming (GP) classifiers by integrating stochastic subsampling with a modified Wilcoxon-Mann-Whitney (WMW) cost function [55]. Major advantages of the SALH method include the ability to actively bias the data distribution for learning, a robust cost function, and a reduced computational cost for fitness evaluation.
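To make the role of the WMW cost concrete, the following is a minimal sketch in Python of a WMW-style pairwise cost. The function name wmw_cost, the margin gamma, and the exponent p are illustrative assumptions and do not reproduce the exact formulation in [55]; the sketch only conveys the core idea that penalizing (positive, negative) score pairs with small margins approximately maximizes the area under the ROC curve.

import numpy as np

def wmw_cost(scores_pos, scores_neg, gamma=0.3, p=2):
    # Pairwise margins s_i - s_j for every (positive, negative) pair.
    margins = scores_pos[:, None] - scores_neg[None, :]
    # Hinge-like penalty on pairs whose margin falls below gamma;
    # minimizing the mean penalty approximately maximizes AUC.
    penalties = np.where(margins < gamma, (gamma - margins) ** p, 0.0)
    return penalties.mean()

# A scorer that ranks all positives above all negatives incurs zero cost.
pos = np.array([0.9, 0.8, 0.7])
neg = np.array([0.2, 0.1])
print(wmw_cost(pos, neg))  # 0.0: every pair respects the margin

In a GP setting such as SALH, a cost of this form can serve directly as the fitness function evaluated on a stochastic subsample of the training data.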
1.2.5 One-Class Learning Methods
One-class learning, or novelty detection, methods have also attracted much attention in the imbalanced learning community [4]. Generally speaking, this category of approaches aims to recognize instances of a concept using mainly, or only, examples of a single class (i.e., a recognition-based methodology) rather than differentiating between instances of the positive and negative classes as in conventional learning approaches (i.e., a discrimination-based inductive methodology). Representative work in this area includes one-class SVMs [56, 57] and the autoassociator (or autoencoder) method [58-60]. For instance, in [59], a comparison between different sampling methods and the one-class autoassociator method was presented. A novelty detection approach based on redundancy compression and nonredundancy differentiation techniques was investigated in [60]. Lee and Cho [61] suggested that novelty detection methods are particularly useful for extremely imbalanced datasets, whereas regular discrimination-based inductive classifiers are better suited to moderately imbalanced datasets.
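As an illustration of the recognition-based methodology, the sketch below fits scikit-learn's OneClassSVM on examples of a single class and flags everything else as a novelty. The synthetic data, the choice to fit on the majority class, and the nu value are assumptions made for illustration; they are not settings taken from [56, 57].

import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Majority ("normal") class: 500 points around the origin.
X_majority = rng.normal(0.0, 1.0, size=(500, 2))
# Rare minority class: 10 points far from the majority region.
X_minority = rng.normal(5.0, 0.5, size=(10, 2))

# Recognition-based learning: fit on the single majority class only.
# nu bounds the fraction of training points treated as outliers;
# 0.05 is an illustrative choice, not a recommended setting.
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
clf.fit(X_majority)

# predict() returns +1 for points recognized as the learned class and
# -1 for novelties; the minority examples should mostly map to -1.
print(clf.predict(X_minority))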
Although current efforts in the community focus largely on two-class imbalanced problems, multi-class imbalanced learning problems also exist and have been investigated in numerous works. For instance, in [62], the cost-sensitive boosting algorithm AdaC2.M1 was proposed to tackle the class imbalance problem with multiple classes. In [63], an iterative method for multi-class cost-sensitive learning was proposed. Other works on multi-class imbalanced learning include the min-max modular network [64] and the rescaling approach for multi-class cost-sensitive neural networks [65], to name a few.
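To give a flavor of the rescaling idea in the multi-class setting, the sketch below weights each class in inverse proportion to its frequency before training a standard classifier. The toy data, the weighting formula, and the use of logistic regression are assumptions for illustration; the sketch does not replicate the neural network formulation of [65].

from collections import Counter

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Three-class toy problem with imbalanced class sizes 300 : 60 : 15.
sizes = {0: 300, 1: 60, 2: 15}
X = np.vstack([rng.normal(c * 3.0, 1.0, size=(n, 2)) for c, n in sizes.items()])
y = np.concatenate([np.full(n, c) for c, n in sizes.items()])

# Rescaling: weight each class inversely to its frequency so that rare
# classes contribute comparably to the training loss.
counts = Counter(y)
weights = {c: len(y) / (len(counts) * n) for c, n in counts.items()}

clf = LogisticRegression(class_weight=weights, max_iter=1000)
clf.fit(X, y)
print(weights)  # rare classes receive proportionally larger weights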
Our discussion in this section by no means covers the complete set of methods for tackling the imbalanced learning problem, given the variety of assumptions about imbalanced data and the different learning objectives of different applications. Interested readers can refer to [1] for a recent survey of imbalanced learning methods. The latest research developments on this topic can be found in the following chapters.