This chapter focuses mainly on introducing these three types of algorithms
for learning from nonstationary data streams with imbalanced class distribution.
Section 7.2 gives the preliminaries concerning different algorithms and compares
different strategies for augmenting the minority class examples in training data
chunks. Section 7.3 presents the algorithmic procedures of these algorithms and
elaborates their theoretical foundation. Section 7.4 evaluates the efficacy of these
algorithms against both real-world and synthetic benchmarks, where the type of
concept drift, the severity of the imbalance ratio, and the level of noise are all
customizable to facilitate a comprehensive comparison. Section 7.5 concludes
the chapter and lists several potential directions that can be pursued in the future.
7.2 PRELIMINARIES
The problem of learning from nonstationary data streams with imbalanced class
distribution manifests largely in two subproblems: (i) How can imbalanced class
distributions be managed? (ii) How can concept drifts be managed? An algorithm
should therefore be designed so that both subproblems are solved effectively and
simultaneously. This section introduces the preliminary knowledge on which existing
methods rely to address these subproblems, as a foundation for understanding the
algorithms described in the next section.
7.2.1 How to Manage Imbalanced Class Distribution
Increasing the number of minority class instances within the training data chunk
to compensate for the imbalanced class ratio is a natural way to improve prediction
accuracy on minority classes. In the context of learning from a static dataset,
this is known as an "over-sampling" method, which typically creates a set of
synthetic minority class instances based on the existing ones. SMOTE [21] is
the most well known method in this category. Using SMOTE, a synthetic instance x
is created along a random segment of the line connecting a minority class example
x_i and one of its k nearest minority class neighbors, x̂_i, that is,

    x = x_i + σ × (x̂_i − x_i)        (7.1)

where σ ∈ (0, 1).
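To make Equation (7.1) concrete, the following minimal Python sketch generates
synthetic minority class instances in the SMOTE style. The function name,
arguments, and defaults are illustrative assumptions, not taken from any
particular library.

    import numpy as np

    def smote_oversample(X_min, k=5, n_synthetic=100, seed=None):
        """Sketch of SMOTE-style over-sampling per Eq. (7.1).

        X_min: 2-D array of existing minority class examples (one per row).
        Returns n_synthetic instances interpolated between a chosen example
        x_i and one of its k nearest minority class neighbors.
        """
        rng = np.random.default_rng(seed)
        synthetic = []
        for _ in range(n_synthetic):
            i = rng.integers(len(X_min))                     # pick a minority example x_i
            dist = np.linalg.norm(X_min - X_min[i], axis=1)  # distances to all minority examples
            dist[i] = np.inf                                 # exclude x_i itself
            neighbors = np.argsort(dist)[:k]                 # indices of its k nearest neighbors
            j = rng.choice(neighbors)                        # one neighbor plays the role of x̂_i
            sigma = rng.uniform(0.0, 1.0)                    # sigma drawn from (0, 1)
            synthetic.append(X_min[i] + sigma * (X_min[j] - X_min[i]))  # Eq. (7.1)
        return np.asarray(synthetic)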
By treating each arriving data chunk as a static dataset, over-sampling can be
applied to augment the minority class examples therein. A potential flaw of this
approach is that information on the minority class carried by previous data chunks
cannot benefit the learning process on the current data chunk, which could result
in "catastrophic forgetting" [14] of minority class concepts. One way
to work around this is to buffer minority class examples over time and add all
of them to the current training data chunk to compensate for the imbalanced class
ratio [26]. Nonetheless, in the presence of concept drift, accommodating previous
minority class examples whose target concept has severely deviated could undermine
prediction performance on the current concept.
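As a rough illustration of this buffering strategy, the sketch below accumulates
minority class examples across chunks and adds them to the current chunk before
training. The base learner, function name, and the single-minority-label
assumption are ours for illustration and do not reproduce the exact procedure
of [26]; note that the sketch reuses every stored minority example regardless of
drift, which is precisely the weakness just discussed.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def train_on_chunk(X, y, minority_label, minority_buffer):
        # Augment the current chunk with all minority examples buffered so far.
        if minority_buffer:
            X_aug = np.vstack([X, np.asarray(minority_buffer)])
            y_aug = np.concatenate([y, np.full(len(minority_buffer), minority_label)])
        else:
            X_aug, y_aug = X, y
        # Remember this chunk's minority examples for future chunks.
        minority_buffer.extend(X[y == minority_label])
        # Train any base learner on the augmented chunk (decision tree as a stand-in).
        return DecisionTreeClassifier().fit(X_aug, y_aug)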