Fig. 4.1 A typical precision-recall diagram (axes: recall and precision).
Based on the above definitions, the accuracy can be defined as a function of sensitivity and specificity:

$$\mathrm{Accuracy} = \mathrm{Sensitivity} \cdot \frac{\mathrm{positive}}{\mathrm{positive} + \mathrm{negative}} + \mathrm{Specificity} \cdot \frac{\mathrm{negative}}{\mathrm{positive} + \mathrm{negative}}. \qquad (4.5)$$
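As a quick numeric check of Eq. (4.5), the sketch below computes accuracy from the two rates weighted by the class proportions. All counts and rates here are illustrative assumptions, not values taken from the text.

```python
# Hypothetical class counts and classifier rates (illustrative only).
positive, negative = 70, 30   # instances of each class in the test set
sensitivity = 0.90            # fraction of positives correctly classified
specificity = 0.80            # fraction of negatives correctly classified

total = positive + negative
# Eq. (4.5): accuracy is the prior-weighted average of the two rates.
accuracy = sensitivity * positive / total + specificity * negative / total
print(accuracy)  # 0.90 * 0.7 + 0.80 * 0.3 = 0.87
```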
4.2.4 The F-Measure
Usually, there is a tradeoff between the precision and recall measures. Trying to improve one measure often results in a deterioration of the other. Figure 4.1 illustrates a typical precision-recall graph. This two-dimensional graph is closely related to the well-known receiver operating characteristic (ROC) graphs, in which the true positive rate (recall) is plotted on the Y-axis and the false positive rate is plotted on the X-axis [Ferri et al. (2002)]. However, unlike the precision-recall graph, the ROC diagram is always convex.
Given a probabilistic classifier, this trade-off graph may be obtained by setting different threshold values. In a binary classification problem, the classifier prefers the class "not pass" over the class "pass" if the probability for "not pass" is at least 0.5. By varying this threshold away from 0.5, each setting yields a different precision-recall point, and together these points trace the trade-off graph, as sketched below.
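The following sketch illustrates this procedure under assumed data: the scores are hypothetical class probabilities for "not pass", the labels are hypothetical ground truth, and sweeping the threshold produces a different (precision, recall) pair at each setting.

```python
# Hypothetical classifier scores P("not pass") and true labels (1 = "not pass").
scores = [0.95, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

# Sweep the decision threshold and record precision/recall at each setting.
for threshold in (0.3, 0.5, 0.7):
    predicted = [int(s >= threshold) for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn)
    print(f"threshold={threshold:.1f}  precision={precision:.2f}  recall={recall:.2f}")
```

Raising the threshold makes the classifier more conservative, so precision tends to rise while recall falls; plotting the resulting pairs yields a curve like the one in Fig. 4.1.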
The problem here can be described as multi-criteria decision-making (MCDM). The simplest and most commonly used method for solving MCDM problems is the weighted sum model. This technique combines the criteria into a single value using appropriate weights. The basic principle behind this technique is the additive utility assumption. The criteria measures must be numerical, comparable, and expressed in the same unit. Nevertheless, in the case discussed here, the arithmetic mean can be misleading. Instead, the harmonic mean provides a better notion of "average". More specifically, this measure, known as the F-measure, is defined as [Van Rijsbergen (1979)]:
$$F = \frac{2 \cdot P \cdot R}{P + R}, \qquad (4.6)$$

where $P$ denotes precision and $R$ denotes recall.
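As a small illustration (with made-up numbers) of why the harmonic mean is preferred here, consider a degenerate classifier that labels every instance positive: its recall is perfect but its precision is poor. The arithmetic mean hides this imbalance, while Eq. (4.6) penalizes it.

```python
def f_measure(p, r):
    """Harmonic-mean combination of precision p and recall r, per Eq. (4.6)."""
    return 2 * p * r / (p + r)

# Hypothetical degenerate case: predict everything positive.
precision, recall = 0.10, 1.00
print((precision + recall) / 2)      # arithmetic mean: 0.55, deceptively high
print(f_measure(precision, recall))  # harmonic mean: ~0.18, exposes low precision
```

Because the harmonic mean is dominated by the smaller of the two values, a high F-measure requires both precision and recall to be high.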