Zhu and Hovy [22] describe a bootstrap-based oversampling strategy (BootOS)
that, given an example to be resampled, generates a bootstrap example based
on all k neighbors of that example. At each epoch, the examples with the
greatest uncertainty are selected for labeling and incorporated into the
labeled set L. The proposed oversampling strategy is then applied to L,
yielding a more balanced dataset L', which is used to retrain the base model.
Selecting the examples with the highest uncertainty for labeling at each
iteration involves resampling the labeled examples and training a new
classifier on the resampled dataset; the scalability of this approach may
therefore be a concern for large-scale datasets.
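As a rough illustration of this idea (not Zhu and Hovy's exact formulation),
the sketch below assumes that each bootstrap example is formed by averaging a
with-replacement sample of an original example's k nearest minority neighbors;
the function name and parameters are hypothetical.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def bootstrap_oversample(X_minority, k=5, n_new=100, seed=None):
        # Fit a neighbor index on the minority examples; the first neighbor
        # returned for each point is the point itself, so request k + 1.
        rng = np.random.default_rng(seed)
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X_minority)
        _, indices = nn.kneighbors(X_minority)
        new_examples = []
        for _ in range(n_new):
            i = rng.integers(len(X_minority))          # pick an example to resample
            neighbors = X_minority[indices[i, 1:]]     # its k nearest minority neighbors
            boot = neighbors[rng.integers(k, size=k)]  # bootstrap (with-replacement) sample
            new_examples.append(boot.mean(axis=0))     # combine into one synthetic example
        return np.vstack(new_examples)

The exact rule for combining the resampled neighbors into a new example is an
assumption here; the point of the sketch is only the bootstrap-from-neighbors
structure of the oversampler.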
In the next section, we demonstrate that the principles of AL are naturally
suited to address the class imbalance problem and that AL can in fact be an
effective strategy for obtaining a balanced view of an otherwise imbalanced
dataset, without the need to resort to resampling techniques. It is worth
noting that the goal of the next section is not to cast AL as a replacement
for resampling strategies. Rather, our main goal is to demonstrate how AL can
alleviate the issues that stem from class imbalance and to present AL as an
alternative technique that should be considered when a resampling approach is
impractical, inefficient, or ineffective. In problems where resampling is the
preferred solution, we show in Section 6.4 that the benefits of AL can still
be leveraged to address class imbalance. In particular, we present an adaptive
oversampling technique that uses AL to determine which examples to resample in
an online setting. These two approaches illustrate the versatility of AL and
the importance of selective sampling in addressing the class imbalance
problem.
6.3 ACTIVE LEARNING FOR IMBALANCED DATA CLASSIFICATION
As outlined in Section 6.2.1, AL is primarily considered a technique to reduce
the number of training samples that need to be labeled for a classification
task. From a traditional perspective, the active learner has access to a vast
pool of unlabeled examples and aims to make a clever choice, selecting the
most informative example and requesting its label. However, even when the
labels of the training data are already available, AL can still be leveraged
to identify the informative examples within the training set [23-25]. For
example, in large-margin
classifiers such as SVM, the informativeness of an example is synonymous with
its distance to the hyperplane. The farther an example is from the hyperplane,
the more confident the learner is about its true class label; hence there is little, if any,
benefit that the learner can gain by asking for the label of that example. On the
other hand, the examples close to the hyperplane are the ones that yield the most
information to the learner. Therefore, the most commonly used AL strategy in
SVMs is to check the distance of each unlabeled example to the hyperplane and
focus on the examples that lie closest to the hyperplane, as they are considered
to be the most informative examples to the learner [8].
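The following is a minimal sketch of this selection step, assuming a linear
SVM trained with scikit-learn, whose decision_function returns signed margins;
the helper name and batch size are illustrative rather than part of the
strategy described in [8].

    import numpy as np
    from sklearn.svm import LinearSVC

    def select_closest_to_hyperplane(model, X_unlabeled, batch_size=10):
        # decision_function returns the signed margin of each example; its
        # absolute value is proportional to the distance from the hyperplane,
        # so the smallest values identify the most uncertain examples.
        margins = np.abs(model.decision_function(X_unlabeled))
        return np.argsort(margins)[:batch_size]

    # Illustrative loop: train on the labeled pool, query the examples that
    # lie closest to the hyperplane, label them, and retrain.
    # model = LinearSVC().fit(X_labeled, y_labeled)
    # query_idx = select_closest_to_hyperplane(model, X_pool, batch_size=20)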