CLASS IMBALANCE AND ACTIVE LEARNING - Imbalanced Learning: Foundations, Algorithms, and Applications - page 133


by allowing the oracle to interact with the base learner, confusing instances, those that “fool” the model, can be sought out from the problem space and used for subsequent training in the form of human-guided uncertainty sampling. This interaction with the base learner can be extended a step further: allowing humans to challenge the model's predictions across the problem space may reveal “problem areas,” portions of the example space where the base model performs poorly and that might not be revealed through traditional techniques such as cross-validation studies [42].

Guided learning, along with alternative problem settings such as that faced by the artificial nose discussed earlier, deals with situations where an oracle is able to provide “random” examples in arbitrary class proportions. It now becomes interesting to consider just what this class proportion should be. This problem appears to be the inverse of the one faced by AL: labels essentially come for free, while the independent feature values are completely unknown and must be gathered at a cost. In this setting, it becomes important to consider the question: “In what proportion should classes be represented in a training set of a certain size?” [43].
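Whatever proportion is chosen, it must eventually be turned into whole-number instance counts for a fixed budget. The following is a minimal sketch of that bookkeeping step only; the function name `allocate_by_proportion` and the largest-remainder rounding rule are assumptions made for illustration, not something prescribed by the text.

```python
def allocate_by_proportion(n, proportions):
    """Split a budget of n instances across classes by the given proportions.

    `proportions` maps class label -> non-negative weight (need not sum to 1).
    Largest-remainder rounding is used so the counts sum to n.
    """
    total = sum(proportions.values())
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")

    # Ideal (fractional) share of the budget for each class.
    exact = {c: n * p / total for c, p in proportions.items()}

    # Start from the integer part of each share.
    counts = {c: int(q) for c, q in exact.items()}
    leftover = n - sum(counts.values())

    # Give the remaining instances to the classes with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda c: exact[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1
    return counts


if __name__ == "__main__":
    # Request 10 new instances with the positive class weighted 1:4 against the negative.
    print(allocate_by_proportion(10, {"+": 1, "-": 4}))
```

The harder question, which proportion to request in the first place, is the subject of the discussion that follows.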

Let us call the problem of proportioning class labels in a selection of n additional training instances “active class selection” (ACS) [38-40, 43]. This process is exemplified in Figure 6.14. In this setting, large, class-conditioned (virtual) pools of available instances with completely hidden feature values are assumed. At each epoch t of the ACS process, the task is to leverage the current model when selecting examples from these pools in a proportion believed to have the greatest effectiveness for improving the generalization performance

[Figure 6.14 diagram. Labeled elements: “Unexplored instances” (pools of + and − instances), “Feature values”, “Training set”, and “Model”.]

Figure 6.14 Active class selection: gathering instances from random class-conditioned

fonts in a proportion believed to offer greatest improvement in generalization performance.
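The epoch-by-epoch ACS process described above can be sketched as a short loop. This is only an illustration under stated assumptions: the class-conditioned sources (`SOURCES`, `draw`), the nearest-class-mean model, and in particular the rule in `choose_proportions` (request more instances of classes on which the current model makes more errors) are all hypothetical choices made here, not the method of the cited works.

```python
import random
from collections import Counter

# Hypothetical class-conditioned sources: the label is chosen for free,
# and each draw pays to reveal one feature value for that label.
SOURCES = {"+": (1.0, 1.5), "-": (-1.0, 0.5)}  # label -> (mean, std), illustrative


def draw(label, rng):
    """Reveal the feature value of one new instance of the requested class."""
    mu, sigma = SOURCES[label]
    return rng.gauss(mu, sigma)


def fit(train):
    """Nearest-class-mean model: store the mean feature value per class."""
    return {c: sum(xs) / len(xs) for c, xs in train.items()}


def predict(model, x):
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda c: abs(x - model[c]))


def class_accuracy(model, train):
    """Per-class accuracy of the model on the instances gathered so far."""
    return {
        c: sum(predict(model, x) == c for x in xs) / len(xs)
        for c, xs in train.items()
    }


def choose_proportions(accuracy, floor=0.05):
    """Assumed heuristic: weight each class by its error rate plus a small floor."""
    weights = {c: (1.0 - a) + floor for c, a in accuracy.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def active_class_selection(epochs=5, batch=20, seed=0):
    rng = random.Random(seed)
    # Seed round: a small balanced sample so that a first model can be fit.
    train = {c: [draw(c, rng) for _ in range(5)] for c in SOURCES}
    history = []
    for _ in range(epochs):
        model = fit(train)
        props = choose_proportions(class_accuracy(model, train))
        # Sample the labels of the next batch according to the chosen proportions.
        labels = rng.choices(list(props), weights=list(props.values()), k=batch)
        counts = Counter(labels)
        for c, k in counts.items():
            train[c].extend(draw(c, rng) for _ in range(k))
        history.append(dict(counts))
    return fit(train), train, history


if __name__ == "__main__":
    model, train, history = active_class_selection()
    for t, counts in enumerate(history):
        print(f"epoch {t}: requested {counts}")
```

Accuracy is measured here on the training instances themselves purely to keep the sketch short; a held-out or cross-validated estimate would be a more defensible signal for choosing proportions.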
