Table 6.3 Support Vectors with SMOTE (SMT), AL, and VIRTUAL

                          #SV(-)/#SV(+)          #SV_V(+)/#V.I.
Dataset      Imb. Rt.   SMT     AL    VIRTUAL     SMT     VIRTUAL

Reuters
  acq            3.7    1.24   1.28    1.18       2.4%     20.3%
  corn          41.9    2.29   3.08    1.95      17.1%     36.6%
  crude         19.0    2.30   2.68    2.00      10.8%     50.4%
  earn           1.7    1.68   1.89    1.67       6.0%     24.2%
  grain         16.9    2.62   3.06    2.32       7.2%     42.3%
  interest      21.4    1.84   2.16    1.66      13.3%     72.2%
  money-fx      13.4    1.86   2.17    1.34       8.2%     31.1%
  ship          38.4    3.45   4.48    2.80      20.0%     66.5%
  trade         20.1    1.89   2.26    1.72      15.4%     26.6%
  wheat         35.7    2.55   3.43    2.22      12.3%     63.9%
UCI
  Abalone        9.7    0.99   1.24    0.99      30.4%     69.2%
  Breast         1.9    1.23   0.60    0.64       2.9%     39.5%
  Letter        24.4    1.21   1.48    0.97       0.98%    74.4%
  Satimage       9.7    1.31   1.93    0.92      37.3%     53.8%

Imb. Rt. is the data imbalance ratio, and #SV(-)/#SV(+) is the support vector imbalance ratio. The rightmost two columns compare the portion of the virtual instances selected as support vectors in SMOTE and VIRTUAL.
methods converge to similar levels of g-means when nearly all training instances are used, and applying an early stopping criterion would have little, if any, effect on their training times.
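For reference, g-means here is the geometric mean of sensitivity (true positive rate) and specificity (true negative rate). A minimal sketch of the metric, assuming class labels in {+1, -1}; the function and variable names are illustrative, not from the original study:

    import numpy as np

    def g_means(y_true, y_pred):
        """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        tpr = np.mean(y_pred[y_true == 1] == 1)    # accuracy on positives
        tnr = np.mean(y_pred[y_true == -1] == -1)  # accuracy on negatives
        return np.sqrt(tpr * tnr)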
Since AL involves discarding some instances from the training set, it can be perceived as a type of under-sampling method. Unlike traditional US, which discards majority samples randomly, AL adaptively searches for the most informative instances in each iteration according to the current hyperplane. Datasets with high class imbalance ratios, such as corn, wheat, letter, and satimage, show a significant decrease in the PRBEP of US (Table 6.3). Note that the under-sampling rate of US for the majority class in each category is set to the same value as the final support vector ratio that AL reaches at the early stopping point and that RS reaches when it sees the entire training data. Although the class imbalance ratios provided to the learner in AL and US are the same, AL achieves significantly better PRBEP than US. The Wilcoxon signed-rank test (two-tailed) reveals that the zero-median hypothesis can be rejected at the 1% significance level (p = 0.0015), implying that AL performs statistically better than US on these 18 datasets. These results reveal the importance of using the informative instances for learning.
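To make this selection strategy concrete, below is a minimal sketch of margin-based active learning with a linear SVM in scikit-learn; the pool/retrain loop, function name, and parameters are illustrative assumptions rather than the exact procedure of the study:

    import numpy as np
    from sklearn.svm import SVC

    def active_learn_svm(X_pool, y_pool, X_seed, y_seed, n_queries=100):
        """Margin-based active learning: repeatedly query the pool instance
        closest to the current hyperplane and retrain the SVM."""
        X_train, y_train = X_seed.copy(), y_seed.copy()
        remaining = np.arange(len(X_pool))
        clf = SVC(kernel="linear", C=1.0)
        for _ in range(min(n_queries, len(remaining))):
            clf.fit(X_train, y_train)
            # |decision_function| is proportional to the distance to the
            # hyperplane; the smallest value marks the most informative point.
            margins = np.abs(clf.decision_function(X_pool[remaining]))
            pick = remaining[np.argmin(margins)]
            X_train = np.vstack([X_train, X_pool[pick][None, :]])
            y_train = np.append(y_train, y_pool[pick])
            remaining = remaining[remaining != pick]
        return clf.fit(X_train, y_train)

The paired comparison reported above can be reproduced with scipy.stats.wilcoxon, which is two-sided by default; the score arrays here are hypothetical placeholders for the per-dataset PRBEP values, not the study's actual numbers:

    import numpy as np
    from scipy.stats import wilcoxon

    # Hypothetical PRBEP scores per dataset for AL and US (illustrative
    # placeholders; the study reports p = 0.0015 over its 18 datasets).
    prbep_al = np.array([77.2, 58.1, 85.3, 94.0, 79.5, 70.2])
    prbep_us = np.array([74.0, 45.3, 80.1, 93.2, 71.8, 60.4])
    stat, p = wilcoxon(prbep_al, prbep_us)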
Table 6.2 compares the computation times of AL and SMOTE. Note that SMOTE requires a significantly long preprocessing time, which dominates the overall training time on large datasets such as MNIST-8. The low computation cost, scalability, and high prediction performance of AL suggest that AL can efficiently handle the class imbalance problem.
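To illustrate where this preprocessing cost arises, the sketch below times SMOTE's oversampling step separately from SVM training. It assumes the imbalanced-learn and scikit-learn packages and uses a synthetic dataset as a stand-in for a large benchmark:

    import time
    from imblearn.over_sampling import SMOTE
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    # A synthetic imbalanced dataset standing in for a large benchmark.
    X, y = make_classification(n_samples=10000, n_features=20,
                               weights=[0.95, 0.05], random_state=0)

    t0 = time.perf_counter()
    X_res, y_res = SMOTE().fit_resample(X, y)   # k-NN search + synthesis
    t_smote = time.perf_counter() - t0

    t0 = time.perf_counter()
    SVC(kernel="linear").fit(X_res, y_res)
    t_train = time.perf_counter() - t0
    # On large datasets the oversampling step (t_smote) can rival or exceed
    # the SVM training time itself.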