combination of all hypotheses then makes the prediction on the current dataset under evaluation. Polikar et al. [8] adopted a similar strategy to combine hypotheses in a weighted manner. The difference lies in the so-called "ensemble-of-ensemble" paradigm, which is applied to create multiple hypotheses over each training data chunk.
7.3 ALGORITHMS
On the basis of how different algorithms compensate for the imbalanced class ratio of the training data chunk under consideration, this section introduces three types of algorithms, that is, the "over-sampling" algorithm, the "take-in-all accommodation" algorithm, and the "selective accommodation" algorithm. The general incremental learning scenario is that a training data chunk S(t) with labeled examples and a testing dataset T(t) with unlabeled instances always arrive in a pairwise manner at any timestamp t. The task of the algorithms at any timestamp t is to make predictions on T(t) as accurately as possible, based on the knowledge they have of S(t) or of the whole data stream {S(1), S(2), ..., S(t)}. Without loss of generality, it is assumed that the imbalanced class ratio is the same for all training data chunks; the discussion is easily generalized to the case in which the training data chunks have different imbalanced class ratios.
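To make the pairwise arrival of S(t) and T(t) concrete, the following Python fragment sketches the chunk-by-chunk evaluation loop; the stream iterable and the update/predict interface of the learner are hypothetical placeholders, not part of any algorithm in this chapter.

# Minimal sketch of the incremental learning scenario: at every timestamp t,
# a labeled training chunk S(t) and an unlabeled testing set T(t) arrive together.
def run_stream(stream, learner):
    """stream yields ((X_train, y_train), X_test) pairs; learner is any object
    exposing update() and predict() (a hypothetical interface)."""
    all_predictions = []
    for (X_train, y_train), X_test in stream:
        learner.update(X_train, y_train)                  # absorb the labeled chunk S(t)
        all_predictions.append(learner.predict(X_test))   # predict on T(t)
    return all_predictions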
7.3.1 Over-Sampling Method
The most naive implementation of this method is to apply an over-sampling method to augment the minority class examples within the newly arrived data chunk. A standard classification algorithm is then used to learn from the augmented training data chunk. Using SMOTE as the over-sampling technique, the implementation is shown in Algorithm 7.1.¹
As in Algorithm 7.1, SMOTE is applied to create a synthetic minority class instance set M(t) on top of the minority class example set P(t) of each training data chunk S(t). M(t) is then appended to S(t) to increase the class ratio of the minority class data therein, which is then used to create the final hypothesis h_final(t).
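As a rough illustration only (not the exact listing of Algorithm 7.1), the per-chunk over-sampling step could be written as follows, assuming the SMOTE implementation from the imbalanced-learn package and a scikit-learn decision tree as the base classifier; both choices are illustrative assumptions.

from imblearn.over_sampling import SMOTE           # assumed available; any SMOTE implementation works
from sklearn.tree import DecisionTreeClassifier    # illustrative base learner

def oversample_and_predict(X_train, y_train, X_test, random_state=0):
    """Augment the minority class of the chunk S(t) with synthetic instances M(t),
    fit a single hypothesis h_final(t) on the augmented chunk, and predict on T(t)."""
    # SMOTE interpolates between minority examples P(t) and their minority-class
    # nearest neighbours to create the synthetic set M(t), returning S(t) plus M(t).
    X_aug, y_aug = SMOTE(random_state=random_state).fit_resample(X_train, y_train)
    h_final = DecisionTreeClassifier(random_state=random_state).fit(X_aug, y_aug)
    return h_final.predict(X_test)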
A more complex way to implement this idea is to create multiple hypothe-
ses on over-sampled training data chunks and use the weighted combination of
these hypotheses to make predictions on the dataset under evaluation. Ditzler
and Chawla [25] followed this idea by applying the Learn++ paradigm, which is
shown in Algorithm 7.2.
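Before the individual steps are walked through, a highly simplified sketch of this ensemble idea is given below: it keeps one hypothesis per over-sampled chunk and combines them by a weighted vote. The weight formula used here (log-odds of each hypothesis's accuracy on its own chunk) is a stand-in assumption and does not reproduce the exact Learn++ weighting of Algorithm 7.2.

import numpy as np
from imblearn.over_sampling import SMOTE           # assumed available, as above
from sklearn.tree import DecisionTreeClassifier    # illustrative base learner

class OversampledChunkEnsemble:
    """One hypothesis per over-sampled training chunk; weighted-vote prediction.
    A simplified stand-in for Algorithm 7.2, not the exact Learn++ procedure."""

    def __init__(self):
        self.hypotheses = []
        self.weights = []

    def update(self, X_train, y_train):
        # Augment the new chunk S(t) with synthetic minority instances M(t),
        # then fit a base hypothesis h_t on the augmented chunk.
        X_aug, y_aug = SMOTE().fit_resample(X_train, y_train)
        h_t = DecisionTreeClassifier().fit(X_aug, y_aug)
        # Weight h_t by its (clipped) error on the original, un-augmented chunk.
        eps = np.clip(1.0 - h_t.score(X_train, y_train), 1e-6, 1.0 - 1e-6)
        self.hypotheses.append(h_t)
        self.weights.append(np.log((1.0 - eps) / eps))

    def predict(self, X_test):
        # Weighted majority vote over all hypotheses kept in memory over time.
        labels = np.unique(np.concatenate([h.classes_ for h in self.hypotheses]))
        scores = np.zeros((len(X_test), len(labels)))
        for h, w in zip(self.hypotheses, self.weights):
            preds = h.predict(X_test)
            for j, label in enumerate(labels):
                scores[:, j] += w * (preds == label)
        return labels[np.argmax(scores, axis=1)]

An instance of this class can be plugged directly into the chunk-by-chunk loop sketched at the beginning of this section.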
The hypotheses created on over-sampled data chunks are kept in memory over time. Whenever there is a new chunk of training data S(t) that arrives, the algorithm first applies the SMOTE method to augment S(t) with the synthetic minority class instances in M(t). Then, a base hypothesis h_t is created on the
¹ /.../ denotes an inline comment; the same comment will occur only once for all algorithms listed hereafter.