combination of all hypotheses then makes the prediction on the current dataset under evaluation. Polikar et al. [8] adopted a similar strategy to combine hypotheses in a weighted manner. The difference lies in the so-called "ensemble-of-ensemble" paradigm, which is applied to create multiple hypotheses over each training data chunk.
7.3 ALGORITHMS
On the basis of how different algorithms compensate for the imbalanced class ratio of the training data chunk under consideration, this section introduces three types of algorithms, that is, the "over-sampling" algorithm, the "take-in-all accommodation" algorithm, and the "selective accommodation" algorithm. The general incremental learning scenario is that a training data chunk S(t) with labeled examples and a testing dataset T(t) with unlabeled instances always arrive in a pairwise manner at any timestamp t. The task of the algorithms at any timestamp t is to make predictions on T(t) as accurately as possible, based on the knowledge they have of S(t) or of the whole data stream {S(1), S(2), ..., S(t)}. Without loss of generality, it is assumed that the imbalanced class ratio is the same for all training data chunks; the discussion is easily generalized to the case in which the training data chunks have different imbalanced class ratios.
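To make the pairwise arrival of S(t) and T(t) concrete, the following Python fragment sketches the chunk-by-chunk evaluation loop; the stream iterable and the update/predict interface of the learner are hypothetical placeholders, not part of any algorithm in this chapter.

# Minimal sketch of the incremental learning scenario: at every timestamp t,
# a labeled training chunk S(t) and an unlabeled testing set T(t) arrive together.
def run_stream(stream, learner):
    """stream yields ((X_train, y_train), X_test) pairs; learner is any object
    exposing update() and predict() (a hypothetical interface)."""
    all_predictions = []
    for (X_train, y_train), X_test in stream:
        learner.update(X_train, y_train)                  # absorb the labeled chunk S(t)
        all_predictions.append(learner.predict(X_test))   # predict on T(t)
    return all_predictions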
7.3.1 Over-Sampling Method
The most naive implementation of this method is to apply an over-sampling method to augment the minority class examples within the newly arrived data chunk. A standard classification algorithm is then used to learn from the augmented training data chunk. Using SMOTE as the over-sampling technique, the implementation is shown in Algorithm 7.1.¹
As in Algorithm 7.1, SMOTE is applied to create a synthetic minority class instance set M(t) on top of the minority class example set P(t) of each training data chunk S(t). M(t) is then appended to S(t) to increase the class ratio of the minority class data therein, which is then used to create the final hypothesis h_final(t).
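As a rough illustration only (not the exact listing of Algorithm 7.1), the per-chunk over-sampling step could be written as follows, assuming the SMOTE implementation from the imbalanced-learn package and a scikit-learn decision tree as the base classifier; both choices are illustrative assumptions.

from imblearn.over_sampling import SMOTE           # assumed available; any SMOTE implementation works
from sklearn.tree import DecisionTreeClassifier    # illustrative base learner

def oversample_and_predict(X_train, y_train, X_test, random_state=0):
    """Augment the minority class of the chunk S(t) with synthetic instances M(t),
    fit a single hypothesis h_final(t) on the augmented chunk, and predict on T(t)."""
    # SMOTE interpolates between minority examples P(t) and their minority-class
    # nearest neighbours to create the synthetic set M(t), returning S(t) plus M(t).
    X_aug, y_aug = SMOTE(random_state=random_state).fit_resample(X_train, y_train)
    h_final = DecisionTreeClassifier(random_state=random_state).fit(X_aug, y_aug)
    return h_final.predict(X_test)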
A more complex way to implement this idea is to create multiple hypothe-
ses on over-sampled training data chunks and use the weighted combination of
these hypotheses to make predictions on the dataset under evaluation. Ditzler
and Chawla [25] followed this idea by applying the Learn++ paradigm, which is
shown in Algorithm 7.2.
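Before the individual steps are walked through, a highly simplified sketch of this ensemble idea is given below: it keeps one hypothesis per over-sampled chunk and combines them by a weighted vote. The weight formula used here (log-odds of each hypothesis's accuracy on its own chunk) is a stand-in assumption and does not reproduce the exact Learn++ weighting of Algorithm 7.2.

import numpy as np
from imblearn.over_sampling import SMOTE           # assumed available, as above
from sklearn.tree import DecisionTreeClassifier    # illustrative base learner

class OversampledChunkEnsemble:
    """One hypothesis per over-sampled training chunk; weighted-vote prediction.
    A simplified stand-in for Algorithm 7.2, not the exact Learn++ procedure."""

    def __init__(self):
        self.hypotheses = []
        self.weights = []

    def update(self, X_train, y_train):
        # Augment the new chunk S(t) with synthetic minority instances M(t),
        # then fit a base hypothesis h_t on the augmented chunk.
        X_aug, y_aug = SMOTE().fit_resample(X_train, y_train)
        h_t = DecisionTreeClassifier().fit(X_aug, y_aug)
        # Weight h_t by its (clipped) error on the original, un-augmented chunk.
        eps = np.clip(1.0 - h_t.score(X_train, y_train), 1e-6, 1.0 - 1e-6)
        self.hypotheses.append(h_t)
        self.weights.append(np.log((1.0 - eps) / eps))

    def predict(self, X_test):
        # Weighted majority vote over all hypotheses kept in memory over time.
        labels = np.unique(np.concatenate([h.classes_ for h in self.hypotheses]))
        scores = np.zeros((len(X_test), len(labels)))
        for h, w in zip(self.hypotheses, self.weights):
            preds = h.predict(X_test)
            for j, label in enumerate(labels):
                scores[:, j] += w * (preds == label)
        return labels[np.argmax(scores, axis=1)]

An instance of this class can be plugged directly into the chunk-by-chunk loop sketched at the beginning of this section.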
The hypotheses created on over-sampled data chunks are kept in memory over time. Whenever there is a new chunk of training data S(t) that arrives, the algorithm first applies the SMOTE method to augment S(t) with the synthetic minority class instances in M(t). Then, a base hypothesis h_t is created on the
¹ /.../ denotes an inline comment; the same comment will occur only once for all algorithms listed hereafter.