removing the need for an offline and separate preprocessing stage. Similar to
the discussions in the previous section, VIRTUAL also employs an online SVM-
based AL strategy. In this setting, the informativeness of instances is measured
by their distance to the hyperplane, and the most informative instances are
selected as the support vectors. VIRTUAL targets the set of support vectors during
training, and resamples new instances based on this set. Since most support
vectors are found during early stages of training, corresponding virtual examples
are also created in the early stages. This prevents the algorithm from creating
excessive and redundant virtual instances, and integrating the resampling process
into the training stage improves the efficiency and generalization performance of
the learner compared to other competitive oversampling techniques.
6.4.1.1 Active Selection of Instances
Let S denote the pool of real and virtual training examples unseen by the learner at each AL step. Instead of searching for the most informative instance among all the samples in S, VIRTUAL employs
the small-pool AL strategy that is discussed in section 6.3.2. From the small
pool, VIRTUAL selects an instance that is closest to the hyperplane according to
the current model. If the selected instance is a real positive instance (from the
original training data) and becomes a support vector, VIRTUAL advances to the
oversampling step, explained in the following section. Otherwise, the algorithm
proceeds to the next iteration to select another instance.
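The selection step above can be sketched as follows for a linear SVM. This is a minimal illustration, not the chapter's implementation: the pool-size default, the function name, and the use of an explicit weight vector w are assumptions (for a kernel SVM, the distance would instead come from the decision function over the support vectors).

```python
import numpy as np

def select_closest_instance(S, w, b, pool_size=50, rng=None):
    """Small-pool active selection: sample a random subset of the unseen
    pool S and return the index (into S) of the instance closest to the
    current hyperplane w.x + b = 0.  pool_size is an illustrative default."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(len(S), size=min(pool_size, len(S)), replace=False)
    # distance of each pooled instance to the hyperplane
    dists = np.abs(S[idx] @ w + b) / np.linalg.norm(w)
    return idx[np.argmin(dists)]
```

Searching only a small random sub-pool, rather than all of S, keeps each selection step cheap while still finding an instance close to the hyperplane with high probability.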
6.4.1.2 Virtual Instance Generation
VIRTUAL oversamples the real minority instances (instances selected from the minority class of the original training data) that become support vectors in the current iteration. It selects the k nearest minority class neighbors x_{i_1}, ..., x_{i_k} of x_i based on their similarities in the kernel-transformed higher dimensional feature space. We limit the neighboring instances of x_i to the minority class so that the new virtual instances lie within the minority class distribution. Depending on the amount of oversampling required, the algorithm creates v virtual instances. Each virtual instance lies on one of the line segments joining x_i and a neighbor x_{i_j} (j = 1, ..., k). In other words, a neighbor x_{i_j} is randomly picked and the virtual instance is created as x_v = λ · x_i + (1 − λ) · x_{i_j}, where λ ∈ (0, 1) determines the placement of x_v between x_i and x_{i_j}. All v virtual instances are added to S and are eligible to be picked by the active learner in the subsequent iterations.
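A minimal sketch of this oversampling step follows. It substitutes plain Euclidean distance in input space for the kernel-transformed feature-space similarity the text describes; that substitution, along with the function and parameter names, is an illustrative assumption.

```python
import numpy as np

def generate_virtual_instances(x_i, minority_X, k=5, v=3, rng=None):
    """Create v virtual minority instances by interpolating between the
    support vector x_i and its k nearest minority-class neighbors:
    x_v = lam * x_i + (1 - lam) * x_ij, with lam drawn from (0, 1).
    Uses Euclidean distance in input space as a stand-in for the
    kernel-space similarity of the original method."""
    rng = np.random.default_rng(rng)
    # k nearest minority neighbors of x_i
    d = np.linalg.norm(minority_X - x_i, axis=1)
    neighbors = minority_X[np.argsort(d)[:k]]
    virtuals = []
    for _ in range(v):
        x_ij = neighbors[rng.integers(len(neighbors))]  # random neighbor
        lam = rng.uniform(0.0, 1.0)                     # placement on segment
        virtuals.append(lam * x_i + (1.0 - lam) * x_ij)
    return np.array(virtuals)
```

Because every x_v is a convex combination of x_i and a minority-class neighbor, each virtual instance stays on a line segment inside the minority class region, as the text requires.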
The pseudocode of VIRTUAL given in Algorithm 6.1 depicts the two processes
described previously. In the beginning, the pool S contains all real instances in
the training set. At the end of each iteration, the instance selected is removed
from S, and any virtual instances generated are included in the pool S. In this pseudocode, VIRTUAL terminates when there are no instances in S.
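The overall loop can be paraphrased as below. Since the chapter's online SVM is not reproduced here, all model-specific behavior is injected through hypothetical callbacks (train_step, distance_to_hyperplane, is_support_vector, make_virtuals); their names and signatures are assumptions made for this sketch, not part of Algorithm 6.1.

```python
import numpy as np

def virtual_main_loop(X, y, train_step, distance_to_hyperplane,
                      is_support_vector, make_virtuals,
                      pool_size=10, seed=0):
    """Skeleton of VIRTUAL's main loop.  S starts as the real training
    pool; each iteration picks, from a small random sub-pool, the
    instance closest to the hyperplane, trains on it, and oversamples
    real minority-class support vectors, adding the virtual instances
    back into S.  Terminates when S is empty."""
    rng = np.random.default_rng(seed)
    S = [(np.asarray(x), lbl, True) for x, lbl in zip(X, y)]  # is_real flag
    model = None
    while S:
        # small-pool selection: closest-to-hyperplane within a random subset
        cand = rng.choice(len(S), size=min(pool_size, len(S)), replace=False)
        i = min(cand, key=lambda j: distance_to_hyperplane(model, S[j][0]))
        x, lbl, is_real = S.pop(i)
        model = train_step(model, x, lbl)
        if is_real and lbl == 1 and is_support_vector(model, x):
            for x_v in make_virtuals(x):               # oversampling step
                S.append((np.asarray(x_v), 1, False))  # virtual, minority
    return model
```

Note that only real minority instances trigger oversampling: virtual instances carry is_real=False, so they can be selected and trained on but never spawn further virtual instances, which guarantees the loop terminates.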
6.4.2 Remarks on VIRTUAL
We compare VIRTUAL with SMOTE, a popular oversampling technique. Figure 6.7a shows the different ways in which SMOTE and VIRTUAL create virtual instances.