CLASS IMBALANCE AND ACTIVE LEARNING - Imbalanced Learning: Foundations, Algorithms, and Applications

Information Technology Reference

In-Depth Information

Algorithm 6.1 VIRTUAL

Define:

X ={ x 1 ,x 2 , ··· ,x n } : training instances

X R : positive real training instances

S : pool of training instances for SVM

v : # virtual instances to create in each iteration

L : size of the small set of randomly picked samples

for active sample selection

1. Initialize S ← X

2. while S =∅

3. // Active sample selection step

4. d min ←∞

5. for i ← 1to L

6.

x j

← RandomSelect(S)

7.

If d(x j , hyperplane) < d min

8.

d min ← d(x j , hyperplane)

9.

candidate ← x j

10.

end

11.

end

12.

x s ← candidate

13.

// Virtual Instance Generation

If x s becomes SV and x s ∈ X R

14.

K ← k nearest neighbors of x s

15.

16.

1to v

17. x m ← RandomSelect(K)

18. // Create a virtual positive instance x s,m between x s and x m

19.

for i ←

λ =random number between 0 and 1

x s,m = λ · x s + ( 1

20.

− λ)x m

21. S ← S ∪ x s,m

22. end

23. end

24. S ← S − x s

25. end

instances for the minority class. SMOTE creates virtual instance(s) for each

positive example (Figure 6.7b), whereas VIRTUAL creates the majority of virtual

instances around the positive canonical hyperplane (shown with a dashed line in

Figure 6.7). Note that a large portion of virtual instances created by SMOTE is

far away from the hyperplane and thus are not likely to be selected as support

vectors. VIRTUAL, on the other hand, generates virtual instances near the real

positive support vectors adaptively in the learning process. Hence, the virtual

instances are near the hyperplane and thus are more informative.

We further analyze the computation complexity of SMOTE and VIRTUAL.

The computation complexity of VIRTUAL is O( | SV ( + ) | · v · C ) , where v is the

Imbalanced Learning: Foundations, Algorithms, and Applications

Search WWH ::

Custom Search

Home