[Figure 6.9 plot omitted. Panels: Acq, Corn, Crude, Earn, Grain, Interest, Ship, Trade, Money-fx, Wheat. Legend: SMOTE, Active learning, VIRTUAL.]
Figure 6.9 Comparison of SMOTE, AL, and VIRTUAL on the 10 largest categories of
Reuters-21578. We show the g-means (%) (y-axis) of the current model on the test
set versus the number of training samples (x-axis) seen.
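The y-axis metric can be made concrete with a minimal sketch. It assumes the usual definition of g-means as the geometric mean of sensitivity (accuracy on the positive, minority class) and specificity (accuracy on the negative class); the function name g_means and the toy labels are illustrative and do not come from the text.

```python
import math
from typing import Sequence


def g_means(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Return sqrt(sensitivity * specificity) as a percentage.

    Assumes binary labels; every label other than `positive` is
    treated as the negative (majority) class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return 100.0 * math.sqrt(sensitivity * specificity)


# Hypothetical imbalanced test set: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
score = g_means(y_true, y_pred)  # sensitivity 0.5, specificity 0.875

# A classifier that always predicts the majority class is 80% accurate
# on this toy set, yet its g-means is 0 because sensitivity is 0.
majority_only = g_means(y_true, [0] * len(y_true))
```

Because the two per-class accuracies are multiplied, a model cannot score well by favoring the majority class, which is why this kind of metric suits imbalanced categories better than plain accuracy.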
Table 6.4 presents the g-means and the total learning time for SMOTE, AL,
and VIRTUAL. The g-means values of the classical batch SVM are also provided as a
reference point. On the Reuters datasets, VIRTUAL yields the highest g-means in all
categories. Table 6.4 shows the effectiveness of adaptive virtual instance
generation. In the categories corn, interest, and ship, which have high class
imbalance ratios, VIRTUAL gains a substantial improvement in g-means. Compared to AL,
VIRTUAL requires additional time to create virtual instances and to select those
that may become support vectors. Despite this overhead, VIRTUAL's training times
are comparable with those of AL. In cases where minority examples are abundant,
SMOTE demands substantially more time to create virtual instances than
VIRTUAL. But as the rightmost columns in Table 6.3 show, only a small fraction