Information Technology Reference
In-Depth Information
[Figure 6.4 plots: four panels, Adult (Imb.Rat.: 3), Adult (Imb.Rat.: 10), Adult (Imb.Rat.: 20), and Adult (Imb.Rat.: 30), each with RS and AL curves. x-axis: #Training instances, 0 to 3 (×10^4); y-axis ticks: 35 to 70.]
Figure 6.4 Comparison of PRBEP of AL and RS on the adult datasets with different
imbalance ratios (Imb.R.=3, 10, 20, 30).
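For readers unfamiliar with the metric in Figure 6.4, the following is a minimal sketch of one common way to compute the precision-recall break-even point (PRBEP): rank the instances by classifier score and predict the top k as positive, where k is the number of actual positives, so that precision and recall coincide. The function name `prbep` and the toy scores are illustrative assumptions, not taken from the chapter.

```python
def prbep(scores, labels):
    """Precision-recall break-even point for binary labels (1 = positive).

    Assumption: the top-k instances by score are predicted positive, where
    k is the number of actual positives. At that cutoff precision and recall
    have the same numerator (true positives) and the same denominator (k).
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("PRBEP is undefined without positive instances")
    # Sort by decreasing score; ties keep their original order.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = sum(1 for _, y in ranked[:n_pos] if y == 1)
    return true_pos / n_pos


# Hypothetical toy scores: two positives among five instances.
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1]
toy_labels = [1, 0, 1, 0, 0]
toy_prbep = prbep(toy_scores, toy_labels)
```

The function returns a fraction in [0, 1]; the tick values in the figure suggest the chapter reports the metric as a percentage, which would correspond to multiplying by 100.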
training set, AL can achieve similar or even higher generalization performance
than that of batch, which sees all the training examples. Another important
observation from Table 6.1 is that the support vector imbalance ratios in the final
models are much lower than the class imbalance ratios of the datasets. This confirms
the discussion of Figure 6.3: the class imbalance ratio within the margins is much
lower than that of the entire data, and AL can be used to reach the informative
examples that are most likely to become support vectors without seeing all the
training examples.
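The contrast between the overall class imbalance ratio and the ratio within the margin can be illustrated with a small sketch. It assumes the imbalance ratio is the number of negative (majority) instances divided by the number of positive (minority) instances, that "within the margin" means |w·x + b| <= 1 for a linear decision function, and that AL queries the pool instance with the smallest |w·x + b|. The helper names, the hyperplane, and the one-dimensional toy data are all hypothetical and chosen only to make the effect visible.

```python
def decision_value(x, w, b):
    """Linear decision function f(x) = w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def imbalance_ratio(labels, positive=1):
    """Negatives per positive (assumed definition of the imbalance ratio)."""
    n_pos = sum(1 for y in labels if y == positive)
    if n_pos == 0:
        raise ValueError("imbalance ratio is undefined without positives")
    return (len(labels) - n_pos) / n_pos


def labels_within_margin(points, labels, w, b):
    """Labels of the instances that satisfy |f(x)| <= 1."""
    return [y for x, y in zip(points, labels)
            if abs(decision_value(x, w, b)) <= 1.0]


def closest_to_hyperplane(points, w, b):
    """Index of the pool instance with the smallest |f(x)|."""
    return min(range(len(points)),
               key=lambda i: abs(decision_value(points[i], w, b)))


# Hypothetical 1-D pool: seven negatives, three positives.
points = [(-5.0,), (-4.0,), (-3.0,), (-2.5,), (-2.0,),
          (-0.8,), (-0.3,), (0.4,), (0.9,), (3.0,)]
labels = [-1, -1, -1, -1, -1, -1, -1, 1, 1, 1]
w, b = (1.0,), 0.0  # assumed hyperplane at x = 0

overall_ratio = imbalance_ratio(labels)
margin_ratio = imbalance_ratio(labels_within_margin(points, labels, w, b))
next_query = closest_to_hyperplane(points, w, b)
```

In this toy pool the overall ratio is 7:3, while the four instances inside the margin band split 2:2, so a learner that queries only near the hyperplane sees a far more balanced sample than one that draws uniformly from the pool.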
Figure 6.5 investigates how the number of support vectors changes when the learner
is presented with examples selected according to AL and RS. Because the base rate
of the dataset gathered by RS approaches that of the example pool, the support
vector imbalance ratio quickly approaches the data imbalance ratio. As learning
continues, the learner gradually sees all the instances within the final margin,
and the support vector imbalance ratio decreases. At the end of training with RS,
the support vector imbalance ratio equals the data imbalance ratio within the margin.
The support vector imbalance ratio curve of AL differs drastically from that of RS.
AL picks the instances closest to the margin in each step. Since the data
imbalance ratio within the margin is lower than the overall data imbalance ratio,
the support vectors in AL are more balanced than those in RS during learning. Using AL,
the model saturates by seeing only 2000 (among 7770) training instances and