This chapter focuses mainly on introducing these three types of algorithms
for learning from nonstationary data streams with imbalanced class distribution.
Section 7.2 gives the preliminaries concerning different algorithms and compares
different strategies for augmenting the minority class examples in training data
chunks. Section 7.3 presents the algorithmic procedures of these algorithms and
elaborates their theoretical foundation. Section 7.4 evaluates the efficacy of these
algorithms against both real-world and synthetic benchmarks, where the type of
concept drift, the severity of the imbalance ratio, and the level of noise are all
customizable to facilitate a comprehensive comparison. Section 7.5 concludes
the chapter and lists several potential directions that can be pursued in the future.
7.2 PRELIMINARIES
The problem of learning from nonstationary data streams with imbalanced class
distribution manifests largely in two subproblems: (i) How can imbalanced class
distributions be managed? (ii) How can concept drifts be managed? An algorithm
should therefore be designed so that both subproblems are solved effectively and
simultaneously. This section introduces the preliminary knowledge on which existing
methods rely to address these subproblems, as a foundation for understanding the
algorithms described in the next section.
7.2.1 How to Manage Imbalanced Class Distribution
Increasing the number of minority class instances within the training data chunk
to compensate for the imbalanced class ratio is a natural way to improve prediction
accuracy on minority classes. In the context of learning from a static dataset,
this is known as an "over-sampling" method, which typically creates a set of
synthetic minority class instances based on the existing ones. SMOTE [21] is
the most well known method in this category. Using SMOTE, a synthetic instance x
is created along a random segment of the line connecting a minority class example
x_i and one of its k nearest minority class neighbors, x̂_i, that is,

    x = x_i + σ × (x̂_i − x_i)        (7.1)

where σ ∈ (0, 1).
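To make Equation (7.1) concrete, the following minimal Python sketch generates
synthetic minority class instances in the SMOTE style. The function name,
arguments, and defaults are illustrative assumptions, not taken from any
particular library.

    import numpy as np

    def smote_oversample(X_min, k=5, n_synthetic=100, seed=None):
        """Sketch of SMOTE-style over-sampling per Eq. (7.1).

        X_min: 2-D array of existing minority class examples (one per row).
        Returns n_synthetic instances interpolated between a chosen example
        x_i and one of its k nearest minority class neighbors.
        """
        rng = np.random.default_rng(seed)
        synthetic = []
        for _ in range(n_synthetic):
            i = rng.integers(len(X_min))                     # pick a minority example x_i
            dist = np.linalg.norm(X_min - X_min[i], axis=1)  # distances to all minority examples
            dist[i] = np.inf                                 # exclude x_i itself
            neighbors = np.argsort(dist)[:k]                 # indices of its k nearest neighbors
            j = rng.choice(neighbors)                        # one neighbor plays the role of x̂_i
            sigma = rng.uniform(0.0, 1.0)                    # sigma drawn from (0, 1)
            synthetic.append(X_min[i] + sigma * (X_min[j] - X_min[i]))  # Eq. (7.1)
        return np.asarray(synthetic)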
By treating each arriving data chunk as a static dataset, over-sampling can be
applied to augment the minority class examples therein. A potential flaw of this
approach is that information on the minority class carried by previous data chunks
cannot benefit the learning process on the current data chunk, which could result
in "catastrophic forgetting" [14] of minority class concepts. One way
to work around this is to buffer minority class examples over time and add all
of them to the current training data chunk to compensate for the imbalanced class
ratio [26]. Nonetheless, in the presence of concept drift, accommodating previous
minority class examples whose target concept has severely deviated could undermine
prediction performance on the current concept.
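As a rough illustration of this buffering strategy, the sketch below accumulates
minority class examples across chunks and adds them to the current chunk before
training. The base learner, function name, and the single-minority-label
assumption are ours for illustration and do not reproduce the exact procedure
of [26]; note that the sketch reuses every stored minority example regardless of
drift, which is precisely the weakness just discussed.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def train_on_chunk(X, y, minority_label, minority_buffer):
        # Augment the current chunk with all minority examples buffered so far.
        if minority_buffer:
            X_aug = np.vstack([X, np.asarray(minority_buffer)])
            y_aug = np.concatenate([y, np.full(len(minority_buffer), minority_label)])
        else:
            X_aug, y_aug = X, y
        # Remember this chunk's minority examples for future chunks.
        minority_buffer.extend(X[y == minority_label])
        # Train any base learner on the augmented chunk (decision tree as a stand-in).
        return DecisionTreeClassifier().fit(X_aug, y_aug)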