Alternatively, we could consider general volatility in class members'
predicted labels, beyond improvement in the model's ability to predict the class.
Again, using cross-validated predictions at successive epochs, it is possible
to isolate members of each class, and observe changes in the predicted class
for each instance. For example, when the predicted label of a given instance
changes between successive epochs, we can deem the instance to have been
redistricted [38-40]. Again considering the level of volatility in a model's
predictions to be a measurement of uncertainty, we can sample classes at epoch
t according to each class's proportional measure of redistricting:
\[
p^R_t(c) = \frac{\frac{1}{|c|} \sum_{x \in c} I\left(f_{t-1}(x) \neq f_{t-2}(x)\right)}{\sum_{c'} \frac{1}{|c'|} \sum_{x \in c'} I\left(f_{t-1}(x) \neq f_{t-2}(x)\right)},
\]
where $I(\cdot)$ is an indicator function taking the value 1 if its argument is true and 0 otherwise, and $f_{t-1}(x)$ and $f_{t-2}(x)$ are the labels predicted for instance $x$ by the models trained at epochs $t-1$ and $t-2$, respectively [38-40].
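To make the sampling rule concrete, here is a minimal sketch (assuming NumPy, and hypothetical arrays of cross-validated predictions from epochs $t-1$ and $t-2$) that computes $p^R_t(c)$ for each class and then samples the class to acquire next:

```python
import numpy as np

def redistricting_distribution(y_true, pred_prev, pred_curr, classes):
    """Compute p_t^R(c): the per-class rate of instances whose predicted
    label changed between epochs t-2 and t-1, normalized over classes."""
    scores = []
    for c in classes:
        members = (y_true == c)                 # instances belonging to class c
        if members.any():
            # (1/|c|) * sum over x in c of I(f_{t-1}(x) != f_{t-2}(x))
            scores.append((pred_prev[members] != pred_curr[members]).mean())
        else:
            scores.append(0.0)
    scores = np.asarray(scores, dtype=float)
    total = scores.sum()
    # If nothing was redistricted, fall back to uniform sampling.
    return scores / total if total > 0 else np.full(len(classes), 1.0 / len(classes))

# Hypothetical usage: sample the class whose examples to request at epoch t.
classes = np.array([0, 1, 2])
y_true = np.array([0, 0, 1, 1, 2, 2])
pred_prev = np.array([0, 1, 1, 1, 2, 0])        # f_{t-2}(x), cross-validated
pred_curr = np.array([0, 0, 1, 2, 2, 0])        # f_{t-1}(x), cross-validated
p = redistricting_distribution(y_true, pred_prev, pred_curr, classes)
next_class = np.random.choice(classes, p=p)     # p = [0.5, 0.5, 0.0] here
```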
6.8.1.2 Expected Class Utility

The previously described ACS heuristics rely on the assumption that adding examples belonging to a particular class will improve the predictive accuracy with respect to that class. However, they do not directly estimate the utility of adding members of a particular class to the model's overall performance. Instead, it may be preferable to select classes whose instances' presence in the training set will reduce the model's misclassification cost by the greatest amount in expectation.
Let $\mathrm{cost}(c_i \mid c_j)$ be the cost of predicting $c_i$ on an instance $x$ whose true label is $c_j$. Then the expected empirical misclassification cost over a sample dataset, $D$, is:
\[
R = \frac{1}{|D|} \sum_{x \in D} \sum_i P(c_i \mid x)\, \mathrm{cost}(c_i \mid y),
\]
where $y$ is the correct class for a given $x$. Typically, in the ACS setting, this expectation would be taken over the training set (e.g., $D = T$), preferably using cross-validation. In order to reduce this risk, we would like to select examples from the class $c$ that leads to the greatest reduction in this expected risk [39].
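As a sketch of this computation (assuming a hypothetical array of cross-validated class probabilities, integer class labels $0, \dots, k-1$, and a user-supplied $k \times k$ cost matrix):

```python
import numpy as np

def empirical_risk(proba, y_true, cost):
    """R = (1/|D|) * sum_{x in D} sum_i P(c_i | x) * cost(c_i | y).

    proba  : (n, k) array of P(c_i | x), ideally cross-validated
    y_true : (n,) array of true class indices y
    cost   : (k, k) matrix with cost[i, j] = cost(c_i | c_j)
    """
    # cost[:, y_true].T selects, for each instance, the column cost(. | y).
    per_instance = (proba * cost[:, y_true].T).sum(axis=1)
    return per_instance.mean()

# Hypothetical two-class example with asymmetric costs.
proba = np.array([[0.8, 0.2], [0.3, 0.7]])
y_true = np.array([0, 1])
cost = np.array([[0.0, 1.0],    # predicting c_0: cost 1 when truth is c_1
                 [5.0, 0.0]])   # predicting c_1: cost 5 when truth is c_0
R = empirical_risk(proba, y_true, cost)   # (1.0 + 0.3) / 2 = 0.65
```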
Consider a predictive model $P_{T_c}(\cdot \mid x)$ built on the training set $T$ supplemented with an arbitrary example belonging to class $c$. Given the opportunity to add a class-representative example to the training pool, we would like to select the class that reduces the expected risk by the greatest amount:
\[
c^{*} = \operatorname*{argmax}_{c}\, U(c),
\]
where
\[
U(c) = \frac{1}{|D|} \sum_{x \in D} \sum_i P_T(c_i \mid x)\, \mathrm{cost}(c_i \mid y) - \frac{1}{|D|} \sum_{x \in D} \sum_i P_{T_c}(c_i \mid x)\, \mathrm{cost}(c_i \mid y).
\]
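One way this selection might be realized is sketched below, under explicit assumptions: scikit-learn estimators, integer class labels $0, \dots, k-1$ so that `predict_proba` columns align with the cost matrix, and a hypothetical `example_for_class` mapping standing in for the "arbitrary example belonging to class $c$". It reuses the `empirical_risk` helper defined above.

```python
import numpy as np
from sklearn.base import clone

def expected_class_utility(base_model, X_train, y_train, X_eval, y_eval,
                           cost, example_for_class):
    """Estimate U(c) = risk(P_T) - risk(P_{T_c}) for each candidate class.

    example_for_class maps each class label c to one representative feature
    vector, the stand-in for an 'arbitrary example belonging to class c'.
    """
    model_T = clone(base_model).fit(X_train, y_train)
    risk_T = empirical_risk(model_T.predict_proba(X_eval), y_eval, cost)

    utilities = {}
    for c, x_c in example_for_class.items():
        X_aug = np.vstack([X_train, x_c])         # T supplemented with x_c
        y_aug = np.append(y_train, c)
        model_Tc = clone(base_model).fit(X_aug, y_aug)
        risk_Tc = empirical_risk(model_Tc.predict_proba(X_eval), y_eval, cost)
        utilities[c] = risk_T - risk_Tc           # reduction in expected risk
    return utilities

# c* = argmax_c U(c), e.g.:
# utilities = expected_class_utility(LogisticRegression(), X, y, X, y, cost, reps)
# best_class = max(utilities, key=utilities.get)
```

In practice, one would estimate both risks by cross-validation, as the text recommends, and could average $U(c)$ over several candidate examples per class rather than relying on a single representative and a single fit.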