ratio that truly applies in a domain is rarely precisely known. ROC analysis gives
an evaluation of what may happen in these diverse situations.
ROC analysis was criticized by Webb and Ting [12] because of the problems
that may arise in the case of changing distributions. The issue is that the true and
false positive rates should remain invariant to changes in class distributions in
order for ROC analysis to be valid. Yet, Webb and Ting [12] argue that changes
in class distributions often also lead to changes in the true and false positive rates,
thus invalidating the results. Fawcett and Flach [13] responded to Webb and Ting [12] in a way that alleviates these concerns: they argue that, of the two important existing classes of domains, the problems pointed out by Webb and Ting [12] apply to only one, and even then not always. When dealing with class imbalances under changing distributions, it is thus recommended to be aware of the issues discussed in these two papers before fully trusting the results of ROC analysis. In fact, Landgrebe et
al. [14] point out another aspect of this discussion. They suggest that in imprecise environments, where class distributions change, the purpose of evaluation is to observe the variability in performance as the distribution changes, in order to select the best model within an expected range of priors. They argue that ROC analysis is not appropriate for this exercise and instead suggest the use of PR curves. PR curves are discussed in Section 8.4.3, which is followed in Section 8.4.5.2 by Landgrebe et al. [14]'s definition of a summary statistic for PR curves, somewhat similar to that of the AUC.
In the meantime, the next section addresses another common complaint about ROC analysis, namely that it is not practical to read when the class imbalance or cost ratio is known.
8.4.2 Cost Curves
Cost curves address this readability issue of ROC analysis. What makes cost curves attractive is their ease of use in determining the best classifier in situations where the error costs or class distribution, or more generally the skew, are known. For example, in Figure 8.2, while it is clear that the curve corresponding to classifier f2 dominates the curve corresponding to classifier f1 at first and that the situation is inverted afterward, this information cannot easily be translated into a statement of the costs and class distributions for which classifier f2 performs better than classifier f1. Cost curves do provide this kind of information.
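As a minimal sketch of this idea (not taken from the text), the comparison can be carried out directly in cost space. Assuming equal misclassification costs, the skew reduces to the proportion p of positive instances, and a classifier summarized by its (FPR, FNR) operating point becomes the line cost(p) = FNR * p + FPR * (1 - p). The operating points below are hypothetical, chosen only to illustrate the computation:

```python
def expected_cost(fpr, fnr, p_pos):
    """Relative expected misclassification cost at positive proportion p_pos,
    assuming equal costs for false positives and false negatives."""
    return fnr * p_pos + fpr * (1.0 - p_pos)

def crossover(clf_a, clf_b):
    """Proportion of positives at which the two cost lines intersect.

    Classifiers are given as (fpr, fnr) pairs. Returns None if the lines
    are parallel or the intersection falls outside [0, 1].
    """
    (fpr_a, fnr_a), (fpr_b, fnr_b) = clf_a, clf_b
    slope_a, slope_b = fnr_a - fpr_a, fnr_b - fpr_b
    if slope_a == slope_b:
        return None
    p = (fpr_b - fpr_a) / (slope_a - slope_b)
    return p if 0.0 <= p <= 1.0 else None

# Hypothetical operating points, loosely in the spirit of f1 and f2:
f1 = (0.30, 0.10)  # liberal: more false positives, fewer missed positives
f2 = (0.10, 0.30)  # conservative: the reverse trade-off

p_star = crossover(f1, f2)  # for these points, the lines cross at p = 0.5
```

With these made-up operating points, f2 incurs the lower expected cost whenever positives make up less than half of the data, and f1 wins beyond that point; this is exactly the kind of statement that is hard to read off a ROC plot directly.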
In particular, cost curves plot the relative expected misclassification cost as a function of the proportion of positive instances in the dataset. An illustration of a cost curve plot is given in Figure 8.3. The important thing to keep in mind is that cost curves are point-line duals of ROC curves. In ROC space, a discrete classifier is represented by a point. The points representing several classifiers (produced by manipulating the threshold of the base classifier) can be joined (and extrapolated) to produce a ROC curve. In cost space, each of the ROC points is represented by