probability estimates should be properly calibrated to be aligned with the test data,
if possible [43, 49, 50].
Prior work has repeatedly demonstrated the benefits of performing ACS
beyond either drawing random examples from a pool for acquisition or
using uniformly balanced selection. In many cases, however, merely recasting
what would typically be an AL problem as an ACS problem and selecting
examples uniformly among the classes can provide results far better
than what would be possible with AL alone. For instance, the learning curves
presented in Figure 6.12 compare such uniform guided learning with AL and
simple random selection. Once the class skew becomes substantial, providing
the model with an essentially random but class-balanced training set far
exceeds the generalization performance achievable by an AL strategy or by
random selection. More intelligent ACS strategies may make this difference
even more pronounced, and they should be considered whenever the development
effort of incorporating such strategies is outweighed by the savings from
reduced data acquisition costs.
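The uniform guided learning just described can be sketched as follows. This is an illustrative simulation, not a method from the chapter: the function name, signature, and the `oracle_label` callback (a stand-in for a human oracle who can supply an example's class) are all assumptions of the sketch.

```python
import random
from collections import Counter

def balanced_acquire(pool, oracle_label, per_class, classes):
    """Sketch of uniform guided learning: visit pool examples in random
    order, keeping each one only while its class is still short of the
    per-class quota. The result is an essentially random yet
    class-balanced training set, even under heavy class skew."""
    counts = Counter()
    training = []
    for x in random.sample(pool, len(pool)):  # random order, no replacement
        y = oracle_label(x)
        if counts[y] < per_class:
            training.append((x, y))
            counts[y] += 1
        if all(counts[c] >= per_class for c in classes):
            break  # quota met for every class
    return training
```

Note that a real guided-learning oracle searches for members of the requested class directly; the rejection loop above merely simulates that behavior against a fixed pool for illustration.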
6.8.2 Feature-Based Learning and Active Dual Supervision
While traditional supervised learning is by far the most prevalent classification
paradigm encountered in the research literature, it is not the only approach for
incorporating human knowledge into a predictive system. By leveraging, for
instance, class associations with certain feature values, predictive systems can be
trained that offer potentially excellent generalization performance without requir-
ing the assignment of class labels to individual instances. Consider the example
domain of predicting the sentiment of movie reviews. In this context, it is clear
that the presence of words such as “amazing” and “thrilling” carries an associ-
ation with the positive class, while terms such as “boring” and “disappointing”
evoke negative sentiment [51]. Gathering this kind of annotation leverages an
oracle's prior experience with the class polarity of certain feature values—in
this case, the emotion that certain terms tend to evoke. The systematic selec-
tion of feature values for labeling by a machine learning system is referred to
as active feature-value labeling. 11 The general setting where class associations
are actively sought for both feature values and particular examples is known
as active dual supervision (ADS). The process of selection for AFL and ADS is shown in Figures 6.15
and 6.16, respectively.
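To make the movie-review example concrete, a minimal sketch of how labeled feature polarities alone can drive prediction is shown below. The lexicon contents and function name are hypothetical illustrations, not a model from the literature discussed next: each term whose polarity an oracle has supplied simply votes for its associated class.

```python
def polarity_classify(tokens, feature_polarity, default="pos"):
    """Classify a tokenized document using only oracle-labeled feature
    polarities (token -> +1 for positive, -1 for negative): known terms
    vote, and the sign of the total decides the class. No instance
    labels are required."""
    score = sum(feature_polarity.get(t, 0) for t in tokens)
    if score == 0:
        return default  # no labeled features present; fall back
    return "pos" if score > 0 else "neg"
```

For instance, with the polarities mentioned above (`{"amazing": 1, "thrilling": 1, "boring": -1, "disappointing": -1}`), a review containing "amazing" and "thrilling" is scored positive. Practical models weight and smooth such knowledge rather than counting raw votes, as the specialized models discussed below do.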
Of course, incorporating the class polarities associated with certain feature
values typically requires specialized models whose functional form has been
designed to leverage feature-based background knowledge. While a survey of
models for incorporating such feature-value/class polarities is beyond the scope
of this chapter, the interested reader is directed to any number of related
papers (cf. [52-58]). However, while sophisticated models of this type have
11 For brevity, this is often shortened to AFL, a moniker best suited to domains with binary
features.