Of course, the benefit that additional examples of a given class will yield on test data is unknown a priori. Furthermore, the impact of a particular class's examples may vary depending on the feature values of the particular instances acquired. To cope with these issues, we can estimate this benefit via cross-validation on the training set: using sampling, we can try various class-conditional additions and compute the expected benefit of each class across that class's representatives in T , assessed on the testing folds. The earlier-mentioned utility then becomes:
\[
U(c) \;=\; \mathbb{E}_{x \in c}\!\left[\, \frac{1}{|D|} \sum_{x' \in D} \sum_{i} P_{T}(c_i \mid x')\,\mathrm{cost}(c_i \mid y') \;-\; \frac{1}{|D|} \sum_{x' \in D} \sum_{i} P_{T \cup c}(c_i \mid x')\,\mathrm{cost}(c_i \mid y') \,\right],
\]
where $P_T(c_i \mid x')$ is the posterior estimate of a model trained on $T$, $P_{T \cup c}(c_i \mid x')$ is that of a model trained on $T$ augmented with the sampled class-$c$ example $x$, and $\mathrm{cost}(c_i \mid y')$ is the cost of predicting class $c_i$ for an instance whose true label is $y'$. $U(c)$ is thus the expected reduction in misclassification cost over the evaluation set $D$.
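To make this concrete, here is a minimal Monte Carlo sketch of the estimate, assuming a scikit-learn-style classifier with integer class labels 0..K-1 (so that predict_proba columns align with the rows of a cost matrix) and using existing class-c training points as stand-ins for newly acquired examples; the function names and the cost-matrix convention cost[i, y] (the cost of predicting class c_i when the true class is y) are illustrative assumptions, not the chapter's notation.

```python
import numpy as np
from sklearn.base import clone

def mean_misclassification_cost(model, X_eval, y_eval, cost):
    """(1/|D|) * sum_{x' in D} sum_i P(c_i | x') * cost(c_i | y')."""
    proba = model.predict_proba(X_eval)       # shape (|D|, n_classes)
    # cost[:, y_eval].T has shape (|D|, n_classes); entry [n, i] is
    # cost(c_i | y_n), the cost of predicting c_i when the truth is y_n.
    return (proba * cost[:, y_eval].T).sum(axis=1).mean()

def class_utility(c, X_train, y_train, X_val, y_val, cost,
                  base_model, n_samples=10, seed=None):
    """Monte Carlo estimate of U(c): the expected reduction in cost on
    the validation fold when one class-c example is added to T."""
    rng = np.random.default_rng(seed)
    before = mean_misclassification_cost(
        clone(base_model).fit(X_train, y_train), X_val, y_val, cost)
    members = np.flatnonzero(y_train == c)    # stand-ins for class-c draws
    deltas = []
    for idx in rng.choice(members, size=n_samples, replace=True):
        X_aug = np.vstack([X_train, X_train[idx]])
        y_aug = np.append(y_train, c)
        after = mean_misclassification_cost(
            clone(base_model).fit(X_aug, y_aug), X_val, y_val, cost)
        deltas.append(before - after)         # positive means cost went down
    return float(np.mean(deltas))
```

In the cross-validated version described above, (X_val, y_val) would be each testing fold in turn, with the resulting estimates averaged across folds.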
Note that it is often preferable to add examples in batches. In this case, we may wish to sample from the classes in proportion to their respective utilities:
\[
p_t(c) \;=\; \frac{U(c)}{\sum_{c'} U(c')}.
\]
Further, diverse class-conditional acquisition costs can be incorporated by utilizing $U(c)/\omega_c$ in place of $U(c)$, where $\omega_c$ is the (expected) cost of acquiring the feature vector of an example in class $c$.
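A small sketch of this batch-sampling step follows; the dictionary-based interface and the clipping of negative utility estimates to zero (so the proportions form a valid distribution) are choices made here for illustration.

```python
import numpy as np

def sample_batch_classes(utilities, batch_size, acquisition_costs=None, seed=None):
    """Draw a batch of class labels in proportion to U(c), optionally
    using the cost-adjusted utility U(c)/omega_c instead."""
    rng = np.random.default_rng(seed)
    classes = sorted(utilities)
    u = np.array([utilities[c] for c in classes], dtype=float)
    if acquisition_costs is not None:
        u /= np.array([acquisition_costs[c] for c in classes], dtype=float)
    u = np.clip(u, 0.0, None)        # drop classes with negative utility
    p = u / u.sum()                  # assumes at least one positive utility
    return rng.choice(classes, size=batch_size, p=p)
```

For example, sample_batch_classes({0: 0.03, 1: 0.12}, batch_size=10, acquisition_costs={0: 1.0, 1: 4.0}) gives both classes a cost-discounted utility of 0.03, so they would be sampled equally often.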
6.8.1.3 Alternative Approaches to ACS

In addition to uncertainty-based and utility-based techniques, there are several alternative techniques for performing ACS. Empirical results show that, barring any domain-specific information, a balanced class distribution tends to offer reasonable AUC on test data when collecting examples for a training set of size n [43, 47]. Motivated by this, a reasonable baseline approach to ACS is simply to select classes in balanced proportion.
Search strategies may alternatively be employed to reveal the most effective class ratio at each epoch. Using a nested cross-validation on the training set, the space of class ratios can be explored, with the most favorable ratio being used at each epoch. Note that it is not feasible to explore all possible class ratios at every epoch without eventually spending too much of the acquisition budget on one class or another. Thus, as the training set approaches its final size n , we can narrow the range of candidate class ratios, on the assumption that a problem-optimal class ratio exists and will become more apparent as more data are obtained [43].
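The sketch below illustrates one epoch of such a search for a binary problem. For brevity it replaces the nested cross-validation with a single cross-validated AUC estimate per candidate ratio, and it assumes 0/1 labels, a fixed per-epoch budget, and candidate ratios that leave enough examples of both classes for 5-fold evaluation; all names here are illustrative.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import cross_val_score

def best_class_ratio(X, y, base_model, candidate_ratios, budget, seed=None):
    """Resample the current data to each candidate positive-class ratio
    and keep the ratio with the best cross-validated AUC."""
    rng = np.random.default_rng(seed)
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    best_ratio, best_auc = None, -np.inf
    for r in candidate_ratios:
        n_pos = min(int(round(r * budget)), len(pos))
        n_neg = min(budget - n_pos, len(neg))
        idx = np.concatenate([rng.choice(pos, n_pos, replace=False),
                              rng.choice(neg, n_neg, replace=False)])
        auc = cross_val_score(clone(base_model), X[idx], y[idx],
                              cv=5, scoring="roc_auc").mean()
        if auc > best_auc:
            best_ratio, best_auc = r, auc
    return best_ratio, best_auc
```

As the acquired set approaches n, candidate_ratios would be narrowed around the ratios that won earlier epochs, reflecting the assumption of a problem-optimal ratio [43].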
It should be noted that many techniques employed for building classification models assume identical or similar training and test distributions. Violating this assumption may lead to biased predictions on test data, where classes preferentially represented in the training data are predicted more frequently. In particular, "increasing the prior probability of a class increases the posterior probability of the class, moving the classification boundary for that class so that more cases are classified into that class" [48, 49]. Thus, in settings where instances are deliberately selected in proportions different from those seen in the wild, posterior estimates should be corrected to account for the difference between the training-set and true class priors.
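One standard correction, sketched below, reweights each posterior by the ratio of the true to the training-set class prior and renormalizes; the chapter does not spell out this formula, so treat it as one common instantiation rather than the authors' prescribed method.

```python
import numpy as np

def correct_posteriors(proba, train_priors, target_priors):
    """Adjust posteriors from a model trained under shifted class priors:
    p'(c | x) is proportional to p(c | x) * target_prior(c) / train_prior(c)."""
    w = np.asarray(target_priors, dtype=float) / np.asarray(train_priors, dtype=float)
    adjusted = proba * w             # broadcasts across rows of proba
    return adjusted / adjusted.sum(axis=1, keepdims=True)
```

For instance, a model trained on a deliberately balanced sample but deployed where only 1% of instances are positive would be corrected with correct_posteriors(proba, train_priors=[0.5, 0.5], target_priors=[0.99, 0.01]).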