Evaluation of Classification Trees - Data Mining with Decision Trees: Theory and Applications


who may be carrying dangerous instruments (such as scissors, penknives and shaving blades). For this purpose the officer is using a classifier that is capable of classifying each passenger either as class A, which means “Carry dangerous instruments”, or as class B, “Safe”.

Suppose that searching a passenger is a time-consuming task and that the security officer is capable of checking only 20 passengers prior to each flight. If the classifier has labeled exactly 20 passengers as class A, then the officer will check all these passengers. However, if the classifier has labeled more than 20 passengers as class A, then the officer is required to decide which class A passengers should be ignored. On the other hand, if fewer than 20 people were classified as A, the officer, who must work constantly, has to decide whom to check from those classified as B after finishing with the class A passengers.
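One way to handle both situations (too many or too few class A labels) is to rank passengers by a score and search the top 20. The sketch below assumes the classifier can output an estimated probability of class A for each passenger, which the scenario above does not state; the function name, the passenger ids and the scores are hypothetical.

```python
def select_for_inspection(scores, quota):
    """Return the ids of the `quota` passengers with the highest class A scores.

    scores: dict mapping a passenger id to the classifier's estimated
            probability of class A (assumed input format).
    quota:  number of passengers the officer is able to search.
    """
    # Rank all passengers from most to least suspicious.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Slicing also covers the case of fewer passengers than the quota.
    return ranked[:quota]


# Hypothetical scores for five passengers and a quota of three searches.
scores = {"p1": 0.91, "p2": 0.15, "p3": 0.67, "p4": 0.48, "p5": 0.05}
print(select_for_inspection(scores, quota=3))  # ['p1', 'p3', 'p4']
```

With a ranking, the officer never has to choose arbitrarily among passengers that share the same label: the quota is filled from the top of the list regardless of how many passengers were labeled A.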

There are also cases in which a quota limitation is known to exist but its size is not known in advance. Nevertheless, the decision maker would like to evaluate the expected performance of the classifier. Such cases occur, for example, in some countries regarding the number of undergraduate students that can be accepted to a certain department in a state university. The actual quota for a given year is set according to different parameters, including the governmental budget. In this case, the decision maker would like to evaluate several classifiers for selecting the applicants while not knowing the actual quota size. Finding the most appropriate classifier in advance is important because the chosen classifier can dictate what the important attributes are, i.e. the information that the applicant should provide to the registration and admission unit.
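When the quota size is unknown, one option is to tabulate a classifier's performance for every possible quota size and compare classifiers across the whole range. The following is a minimal sketch under that assumption: it counts how many truly positive instances fall among the j top-scored ones, for j = 1, ..., n. The function name and the sample data are hypothetical.

```python
def hits_by_quota(scores, labels, positive="pos"):
    """For each quota size j = 1..n, count the positive instances
    among the j instances with the highest scores."""
    # Indices of the instances, ordered from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, running = [], 0
    for i in order:
        running += labels[i] == positive  # True counts as 1
        hits.append(running)
    return hits


# Hypothetical test set: estimated probabilities and true classes.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = ["pos", "neg", "pos", "pos", "neg"]
print(hits_by_quota(scores, labels))  # [1, 1, 2, 3, 3]
```

Entry j - 1 of the result is the number of correct selections if the quota turns out to be j, so two classifiers can be compared for each plausible quota before the real value is announced.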

In probabilistic classifiers, the above-mentioned definitions of precision and recall can be extended and defined as a function of a probability threshold $\tau$. If we evaluate a classifier based on a given test set which consists of $n$ instances denoted as $(\langle x_1, y_1 \rangle, \ldots, \langle x_n, y_n \rangle)$ such that $x_i$ represents the input feature vector of instance $i$ and $y_i$ represents its true class (“positive” or “negative”), then:

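$$\text{Precision}(\tau) = \frac{\left|\{\langle x_i, y_i \rangle : P_{DT}(pos \mid x_i) > \tau,\ y_i = pos\}\right|}{\left|\{\langle x_i, y_i \rangle : P_{DT}(pos \mid x_i) > \tau\}\right|}, \tag{4.8}$$

$$\text{Recall}(\tau) = \frac{\left|\{\langle x_i, y_i \rangle : P_{DT}(pos \mid x_i) > \tau,\ y_i = pos\}\right|}{\left|\{\langle x_i, y_i \rangle : y_i = pos\}\right|}, \tag{4.9}$$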

where DT represents a probabilistic classifier that is used to estimate the conditional likelihood of an observation $x_i$ to be “positive”, which is denoted as $P_{DT}(pos \mid x_i)$.
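Equations (4.8) and (4.9) can be computed directly from a list of estimated probabilities and the corresponding true classes. The sketch below is one straightforward reading of the two definitions; the function name, the `None` return for an empty denominator, and the sample data are assumptions made for illustration.

```python
def precision_recall_at(tau, probs, labels, positive="pos"):
    """Return (precision, recall) for the probability threshold tau.

    probs[i]  : estimated P_DT(pos | x_i) for instance i
    labels[i] : true class y_i of instance i
    A value is None when its denominator is zero (an assumed convention).
    """
    # True classes of the instances whose estimated probability exceeds tau.
    predicted_pos = [y for p, y in zip(probs, labels) if p > tau]
    # Numerator shared by (4.8) and (4.9).
    true_pos = sum(1 for y in predicted_pos if y == positive)
    # Denominator of (4.9): all truly positive instances.
    actual_pos = sum(1 for y in labels if y == positive)
    precision = true_pos / len(predicted_pos) if predicted_pos else None
    recall = true_pos / actual_pos if actual_pos else None
    return precision, recall


# Hypothetical test set of five instances.
probs = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = ["pos", "neg", "pos", "pos", "neg"]

# tau = 0.7 keeps the instances scored 0.9 and 0.8:
# precision is 1/2 and recall is 1/3.
print(precision_recall_at(0.7, probs, labels))
```

Raising the threshold shrinks the set of instances predicted as positive, so on this sample recall can only stay the same or drop, while precision may move in either direction.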
