examples, while oversampling increases the time required to train a classifier
and can also lead to overfitting, because the classifier is induced to cover the
duplicated training examples [31, 33].
More advanced sampling methods use some intelligence when removing or
adding examples. This can minimize the drawbacks that were just described
and, in the case of intelligently adding examples, has the potential to address
the underlying issue of absolute rarity. One undersampling strategy removes
only those majority class examples that are redundant with other examples or
that border regions containing minority class examples, on the grounds that
they may be the result of noise [34]. The synthetic minority oversampling
technique (SMOTE), on the other hand, oversamples the data by introducing new,
non-replicated minority class examples generated along the line segments that
join each minority class example to its five nearest minority class neighbors
[33]. This tends to expand the decision boundaries associated with the small
disjuncts/rare cases, as opposed to the overfitting associated with random
oversampling. Another approach is to identify a good class distribution for learning
and then generate samples with that distribution. Once this is done, multiple
training sets with the desired class distribution can be formed using all minority
class examples and a subset of the majority class examples. This can be done so
that each majority class example is guaranteed to occur in at least one training
set, so that no data is wasted. The learning algorithm is then applied to each
training set, and meta-learning is used to form a composite learner from the
resulting classifiers. This approach can be used with any learning method and
has been applied to four different learning algorithms [1]. The same basic
approach of partitioning the data and learning multiple classifiers has also
been used with support vector machines (SVMs), and an SVM ensemble has been
shown to outperform both undersampling and oversampling [35].
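
To make the SMOTE-style interpolation described above concrete, the following is a
minimal sketch in Python/NumPy. It is not the reference implementation from [33];
the function name smote, its parameters, and the choice of k = 5 neighbors are
illustrative assumptions, and the minority class examples are assumed to be supplied
as a NumPy array with more than k rows.

import numpy as np

def smote(minority_X, n_synthetic, k=5, rng=None):
    # Illustrative sketch: generate synthetic minority examples by interpolating
    # between a randomly chosen minority example and one of its k nearest
    # minority-class neighbors.
    rng = np.random.default_rng(rng)
    n, d = minority_X.shape
    # Pairwise distances computed within the minority class only.
    dists = np.linalg.norm(minority_X[:, None, :] - minority_X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                 # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]    # indices of the k nearest neighbors

    synthetic = np.empty((n_synthetic, d))
    for i in range(n_synthetic):
        base = rng.integers(n)                      # pick a minority example at random
        nbr = neighbors[base, rng.integers(k)]      # pick one of its k neighbors
        gap = rng.random()                          # random point along the segment
        synthetic[i] = minority_X[base] + gap * (minority_X[nbr] - minority_X[base])
    return synthetic

In practice, a library implementation such as the SMOTE class in the imbalanced-learn
package would normally be used instead, since it handles edge cases (e.g., fewer than
k minority examples) and integrates with standard learning pipelines.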
All of these more sophisticated methods attempt to reduce some of the draw-
backs associated with the simple random sampling methods. But for the most
part, it seems unlikely that they introduce any new knowledge and hence they
do not appear to truly address any of the underlying issues previously identified.
Rather, they at best compensate for learning algorithms that are not well suited to
dealing with class imbalance. This point is made quite clearly in the description
of the SMOTE method, when it is mentioned that the introduction of the new
examples effectively serves to change the bias of the learner, forcing a more
general bias, but only for the minority class. Theoretically, such a modification
to the bias could be implemented at the algorithm level. As discussed later, there
has been research at the algorithm level in modifying the bias of a learner to
better handle imbalanced data.
The sampling methods just described are designed to reduce between-class
imbalance. Although research indicates that reducing between-class imbalance
will also tend to reduce within-class imbalances [4], it is worth considering
whether sampling methods can be used in a more direct manner to reduce
within-class imbalances — and whether this is beneficial. This question has been
studied using artificial domains and the results indicate that it is not sufficient to
eliminate between-class imbalances (i.e., rare classes) in order to learn complex