Table 6.3 Support Vectors with SMOTE (SMT), AL, and VIRTUAL

                          #SV(-)/#SV(+)          #SV_V(+)/#V.I.
Dataset      Imb. Rt.   SMT     AL    VIRTUAL     SMT     VIRTUAL

Reuters
  acq            3.7    1.24   1.28    1.18       2.4%     20.3%
  corn          41.9    2.29   3.08    1.95      17.1%     36.6%
  crude         19.0    2.30   2.68    2.00      10.8%     50.4%
  earn           1.7    1.68   1.89    1.67       6.0%     24.2%
  grain         16.9    2.62   3.06    2.32       7.2%     42.3%
  interest      21.4    1.84   2.16    1.66      13.3%     72.2%
  money-fx      13.4    1.86   2.17    1.34       8.2%     31.1%
  ship          38.4    3.45   4.48    2.80      20.0%     66.5%
  trade         20.1    1.89   2.26    1.72      15.4%     26.6%
  wheat         35.7    2.55   3.43    2.22      12.3%     63.9%
UCI
  Abalone        9.7    0.99   1.24    0.99      30.4%     69.2%
  Breast         1.9    1.23   0.60    0.64       2.9%     39.5%
  Letter        24.4    1.21   1.48    0.97       0.98%    74.4%
  Satimage       9.7    1.31   1.93    0.92      37.3%     53.8%

Imb. Rt. is the data imbalance ratio, and #SV(-)/#SV(+) is the support vector imbalance ratio. The rightmost two columns compare the portion of the virtual instances selected as support vectors in SMOTE and VIRTUAL.
methods converge to similar levels of g-means when nearly all training instances are used, and applying an early stopping criterion would have little, if any, effect on their training times.
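For reference, g-means here is the geometric mean of sensitivity (true positive rate) and specificity (true negative rate). A minimal sketch of the metric, assuming class labels in {+1, -1}; the function and variable names are illustrative, not from the original study:

    import numpy as np

    def g_means(y_true, y_pred):
        """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        tpr = np.mean(y_pred[y_true == 1] == 1)    # accuracy on positives
        tnr = np.mean(y_pred[y_true == -1] == -1)  # accuracy on negatives
        return np.sqrt(tpr * tnr)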
Since AL involves discarding some instances from the training set, it can be perceived as a type of under-sampling method. Unlike traditional US, which discards majority samples randomly, AL adaptively searches for the most informative instances in each iteration according to the current hyperplane. Datasets with high class imbalance ratios, such as corn, wheat, letter, and satimage, show a significant decrease in the PRBEP of US (Table 6.3). Note that the under-sampling rate of US for the majority class in each category is set to the same value as the final support vector ratio that AL reaches at the early stopping point and that RS reaches when it sees the entire training data. Although the class imbalance ratios provided to the learner in AL and US are the same, AL achieves significantly better PRBEP than US. The Wilcoxon signed-rank test (two-tailed) reveals that the zero-median hypothesis can be rejected at the 1% significance level (p = 0.0015), implying that AL performs statistically better than US on these 18 datasets. These results reveal the importance of using the informative instances for learning.
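To make this selection strategy concrete, below is a minimal sketch of margin-based active learning with a linear SVM in scikit-learn; the pool/retrain loop, function name, and parameters are illustrative assumptions rather than the exact procedure of the study:

    import numpy as np
    from sklearn.svm import SVC

    def active_learn_svm(X_pool, y_pool, X_seed, y_seed, n_queries=100):
        """Margin-based active learning: repeatedly query the pool instance
        closest to the current hyperplane and retrain the SVM."""
        X_train, y_train = X_seed.copy(), y_seed.copy()
        remaining = np.arange(len(X_pool))
        clf = SVC(kernel="linear", C=1.0)
        for _ in range(min(n_queries, len(remaining))):
            clf.fit(X_train, y_train)
            # |decision_function| is proportional to the distance to the
            # hyperplane; the smallest value marks the most informative point.
            margins = np.abs(clf.decision_function(X_pool[remaining]))
            pick = remaining[np.argmin(margins)]
            X_train = np.vstack([X_train, X_pool[pick][None, :]])
            y_train = np.append(y_train, y_pool[pick])
            remaining = remaining[remaining != pick]
        return clf.fit(X_train, y_train)

The paired comparison reported above can be reproduced with scipy.stats.wilcoxon, which is two-sided by default; the score arrays here are hypothetical placeholders for the per-dataset PRBEP values, not the study's actual numbers:

    import numpy as np
    from scipy.stats import wilcoxon

    # Hypothetical PRBEP scores per dataset for AL and US (illustrative
    # placeholders; the study reports p = 0.0015 over its 18 datasets).
    prbep_al = np.array([77.2, 58.1, 85.3, 94.0, 79.5, 70.2])
    prbep_us = np.array([74.0, 45.3, 80.1, 93.2, 71.8, 60.4])
    stat, p = wilcoxon(prbep_al, prbep_us)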
Table 6.2 compares the computation times of AL and SMOTE. Note that SMOTE requires a significantly long preprocessing time, which dominates the overall training time on large datasets such as MNIST-8. The low computation cost, scalability, and high prediction performance of AL suggest that AL can efficiently handle the class imbalance problem.
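To illustrate where this preprocessing cost arises, the sketch below times SMOTE's oversampling step separately from SVM training. It assumes the imbalanced-learn and scikit-learn packages and uses a synthetic dataset as a stand-in for a large benchmark:

    import time
    from imblearn.over_sampling import SMOTE
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    # A synthetic imbalanced dataset standing in for a large benchmark.
    X, y = make_classification(n_samples=10000, n_features=20,
                               weights=[0.95, 0.05], random_state=0)

    t0 = time.perf_counter()
    X_res, y_res = SMOTE().fit_resample(X, y)   # k-NN search + synthesis
    t_smote = time.perf_counter() - t0

    t0 = time.perf_counter()
    SVC(kernel="linear").fit(X_res, y_res)
    t_train = time.perf_counter() - t0
    # On large datasets the oversampling step (t_smote) can rival or exceed
    # the SVM training time itself.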