Generality Is Predictive of Prediction Accuracy - Data Mining: Theory, Methodology, Techniques, and Applications - page 10

this should be handled. However, we contend that practitioners are better off being aware of this trade-off than making decisions in ignorance of their consequences.

Pazzani, Murphy, Ali, and Schulenburg [12] have argued with empirical support that where a classifier has an option of not making predictions (such as when used for identification of market trading opportunities), selection of more specific rules can be expected to create a system that makes fewer decisions of higher expected quality. Our hypotheses provide an explanation of this result. When the accuracy of the rules on the training data is high, specializing the rules can be expected to raise their accuracy on unseen data towards that obtained on the training data.
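
The trade-off between the number of decisions and their quality can be sketched in a few lines. This is only an illustration under stated assumptions: the examples in `UNSEEN`, the class labels `buy` and `hold`, and the helpers `make_rule` and `evaluate` are hypothetical and do not come from the source. A rule abstains (returns `None`) on any instance it does not cover.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical unseen examples: (attribute value, true class).
UNSEEN: List[Tuple[int, str]] = [
    (1, "hold"), (2, "hold"), (3, "buy"), (4, "buy"),
    (5, "buy"), (6, "hold"), (7, "hold"), (8, "hold"),
]

def make_rule(low: int, high: int, predicted: str) -> Callable[[int], Optional[str]]:
    """Build the rule 'IF low <= x <= high THEN predicted'; abstain otherwise."""
    def classify(x: int) -> Optional[str]:
        return predicted if low <= x <= high else None
    return classify

def evaluate(classify, examples):
    """Return (number of decisions made, accuracy over those decisions)."""
    made = [(classify(x), truth) for x, truth in examples if classify(x) is not None]
    if not made:
        return 0, None
    correct = sum(1 for predicted, truth in made if predicted == truth)
    return len(made), correct / len(made)

general_rule = make_rule(2, 7, "buy")    # more general: covers more instances
specific_rule = make_rule(3, 5, "buy")   # more specific: covers fewer instances

general_result = evaluate(general_rule, UNSEEN)
specific_result = evaluate(specific_rule, UNSEEN)
print("general rule (decisions, accuracy):", general_result)
print("specific rule (decisions, accuracy):", specific_result)
```

On this constructed data the more specific rule makes fewer decisions, and a larger share of them are correct. The data were chosen to show the effect, so the sketch illustrates the argument rather than supplying evidence for it.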

Where a classifier must always make decisions and maximization of prediction accuracy is desired, our results suggest that rules for the class that occurs most frequently should be generalized at the expense of rules for alternative classes. This is because, as each rule is generalized, its accuracy will trend towards that of a default rule for its class, and the accuracy of a default rule is highest for the most frequently occurring class.
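
The limiting case of this trend can be checked directly. The following is a minimal sketch, assuming a hypothetical one-attribute dataset (`DATA`) and hypothetical helpers `rule_accuracy` and `default_accuracy`, none of which appear in the source. It widens an interval rule step by step until it covers every example, at which point its accuracy equals the relative frequency of the predicted class, which is the accuracy of the default rule for that class.

```python
from fractions import Fraction

# Hypothetical training data: (attribute value, class label).
# "pos" is the majority class (6 of the 10 examples).
DATA = [
    (1, "pos"), (2, "neg"), (3, "pos"), (4, "pos"), (5, "pos"),
    (6, "pos"), (7, "pos"), (8, "neg"), (9, "neg"), (10, "neg"),
]

def rule_accuracy(data, low, high, predicted):
    """Accuracy of 'IF low <= x <= high THEN predicted' over the examples it covers."""
    covered = [label for x, label in data if low <= x <= high]
    if not covered:
        return None  # the rule covers nothing, so accuracy is undefined
    correct = sum(1 for label in covered if label == predicted)
    return Fraction(correct, len(covered))

def default_accuracy(data, predicted):
    """Accuracy of the default rule that always predicts the given class."""
    correct = sum(1 for _, label in data if label == predicted)
    return Fraction(correct, len(data))

# Successively more general versions of a rule predicting "pos".
intervals = [(4, 7), (3, 8), (2, 9), (1, 10)]
accuracies = [rule_accuracy(DATA, low, high, "pos") for low, high in intervals]

for (low, high), accuracy in zip(intervals, accuracies):
    print(f"IF {low} <= x <= {high} THEN pos: accuracy {accuracy}")
print("default rule for pos:", default_accuracy(DATA, "pos"))
print("default rule for neg:", default_accuracy(DATA, "neg"))
```

The intermediate accuracies depend on how the invented examples are arranged. Only the end point is general: a rule that covers the whole instance space has exactly the accuracy of the default rule for its class.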

Another point that should be considered, however, is alternative sources of information that might be brought to bear upon such decisions. We have emphasized that our hypotheses relate only to contexts in which there is no evidence available to distinguish between the expected accuracy of two rules other than their relative generality. In many cases we believe it may be possible to derive such evidence from training data. For example, we are likely to have differing expectations about the likely accuracy of the two alternative generalizations depicted in Fig. 2. This figure depicts a two-dimensional instance space, defined by two attributes, A and B, and populated by training examples belonging to two classes, each denoted by a distinct shape. Three alternative rules are presented together with the region of the instance space that each covers. In this example it appears reasonable to expect better accuracy from the rule depicted in Fig. 2b than from that depicted in Fig. 2c, as the former generalizes toward a region of the instance space dominated by the same class as the rule whereas the latter generalizes toward a region of the instance space dominated by a different class.

[Figure 2: three panels, each plotting attribute B (vertical axis) against attribute A (horizontal axis) and showing the training examples together with the region covered by one rule.]

a) Initial rule: IF 4 ≤ B ≤ 6 THEN •

b) First generalization: IF 4 ≤ B ≤ 7 THEN •

c) Second generalization: IF 3 ≤ B ≤ 6 THEN •

Fig. 2. Alternative generalizations to a rule
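
One way such evidence might be drawn from training data is to look at the class mix of the examples that a generalization newly covers. The sketch below is an assumption-laden illustration, not a method given in the source. The rule bounds are those of Fig. 2, but the coordinates in `EXAMPLES`, the class names `dot` and `other`, and the helper `extension_purity` are hypothetical and are not read off the figure.

```python
from fractions import Fraction

# Hypothetical training examples as (A, B, class label).
EXAMPLES = [
    (1.0, 4.5, "dot"), (3.0, 5.0, "dot"), (5.0, 5.5, "dot"), (7.0, 4.2, "dot"),
    (2.0, 6.5, "dot"), (6.0, 6.8, "dot"), (4.0, 6.6, "other"),
    (1.5, 3.5, "other"), (4.5, 3.2, "other"), (8.0, 3.8, "dot"),
    (3.0, 8.5, "dot"), (6.0, 1.5, "other"),
]

INITIAL = (4, 6)      # a) IF 4 <= B <= 6
FIRST_GEN = (4, 7)    # b) IF 4 <= B <= 7
SECOND_GEN = (3, 6)   # c) IF 3 <= B <= 6

def in_rule(bounds, b):
    """True if attribute value b satisfies low <= b <= high."""
    low, high = bounds
    return low <= b <= high

def extension_purity(examples, initial, generalized, predicted):
    """Share of the examples newly covered by the generalization
    (covered by it but not by the initial rule) that belong to the
    predicted class. Returns None if nothing new is covered."""
    new_labels = [
        label for _, b, label in examples
        if in_rule(generalized, b) and not in_rule(initial, b)
    ]
    if not new_labels:
        return None
    matching = sum(1 for label in new_labels if label == predicted)
    return Fraction(matching, len(new_labels))

purity_b = extension_purity(EXAMPLES, INITIAL, FIRST_GEN, "dot")
purity_c = extension_purity(EXAMPLES, INITIAL, SECOND_GEN, "dot")
print("newly covered share of predicted class, generalization b:", purity_b)
print("newly covered share of predicted class, generalization c:", purity_c)
```

With these invented coordinates the upward extension mostly picks up examples of the predicted class while the downward extension mostly picks up the other class, which mirrors the reasoning given for preferring Fig. 2b over Fig. 2c.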
