this should be handled. However, we contend that practitioners are better off
aware of this trade-off than making decisions in ignorance of their consequences.
Pazzani, Murphy, Ali, and Schulenburg [12] have argued with empirical sup-
port that where a classifier has an option of not making predictions (such as
when used for identification of market trading opportunities), selection of more
specific rules can be expected to create a system that makes fewer decisions of
higher expected quality. Our hypotheses provide an explanation of this result.
When the accuracy of the rules on the training data is high, specializing the rules
can be expected to raise their accuracy on unseen data towards that obtained
on the training data.
Where a classifier must always make decisions and maximization of prediction
accuracy is desired, our results suggest that rules for the class that occurs most
frequently should be generalized at the expense of rules for alternative classes.
This is because as each rule is generalized it will trend towards the accuracy of a
default rule for that class, which will be highest for rules of the most frequently
occurring class.
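The default-rule argument above can be made concrete with a small sketch. The class names, the counts, and the helper name `default_rule_accuracy` are hypothetical and chosen only for illustration; they do not come from the paper. A default rule for a class covers every instance and always predicts that class, so its accuracy equals the relative frequency of the class.

```python
from collections import Counter


def default_rule_accuracy(labels):
    """Return {class: accuracy of the default rule 'IF true THEN class'}.

    A default rule covers every instance, so its accuracy is simply
    the relative frequency of its class in the data.
    """
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical training labels (illustrative only, not from the paper).
labels = ["pos"] * 70 + ["neg"] * 30

accuracies = default_rule_accuracy(labels)
# The class whose default rule is most accurate is the most frequent class.
majority_class = max(accuracies, key=accuracies.get)

print(accuracies)
print(majority_class)
```

Under these assumed counts, a fully generalized rule for the majority class has the highest accuracy floor, which is the limit that rules for that class trend towards as they are generalized.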
Another point that should be considered, however, is alternative sources of
information that might be brought to bear upon such decisions. We have emphasized
that our hypotheses relate only to contexts in which no evidence other than
relative generality is available to distinguish between the expected accuracy
of two rules. In many cases we believe it may be possible
to derive such evidence from training data. For example, we are likely to have
differing expectations about the likely accuracy of the two alternative general-
izations depicted in Fig. 2. This figure depicts a two dimensional instance space,
defined by two attributes, A and B, and populated by training examples belonging
to two classes, each denoted by a different shape. Three alternative rules are
presented together with the region of the instance space that each covers. In this
example it appears reasonable to expect better accuracy from the rule depicted
in Fig. 2b than that depicted in Fig. 2c as the former generalizes toward a region
of the instance space dominated by the same class as the rule whereas the latter
generalizes toward a region of the instance space dominated by a different class.
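One way to derive such evidence from training data is to inspect only the region that a generalization newly covers. The following is a minimal sketch under stated assumptions: the rule bounds are those given for the rules of Fig. 2, but the training points, the class names `"dot"` and `"other"`, and the helper names are invented for illustration and do not reproduce the figure's data.

```python
def covers(rule, b):
    """A rule is a (low, high) interval on attribute B, inclusive."""
    low, high = rule
    return low <= b <= high


def added_region_counts(initial, generalized, examples):
    """Count classes among examples covered by `generalized` but not by `initial`."""
    counts = {}
    for _a, b, cls in examples:
        if covers(generalized, b) and not covers(initial, b):
            counts[cls] = counts.get(cls, 0) + 1
    return counts


# Hypothetical training examples as (A, B, class); not the data of Fig. 2.
examples = [
    (2, 6.5, "dot"), (5, 6.8, "dot"), (8, 7.0, "dot"),
    (3, 3.2, "other"), (6, 3.5, "other"), (9, 3.9, "dot"),
    (4, 5.0, "dot"), (7, 4.5, "dot"), (5, 9.0, "other"),
]

initial = (4, 6)   # IF 4 <= B <= 6 THEN dot
gen_b = (4, 7)     # first generalization: IF 4 <= B <= 7 THEN dot
gen_c = (3, 6)     # second generalization: IF 3 <= B <= 6 THEN dot

added_b = added_region_counts(initial, gen_b, examples)
added_c = added_region_counts(initial, gen_c, examples)

print(added_b)
print(added_c)
```

With these made-up points, the strip added by the first generalization contains only examples of the rule's own class, while the strip added by the second is dominated by the other class, mirroring the reasoning given for preferring one generalization over the other.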
[Figure 2: three panels, each plotting attribute B (vertical axis) against attribute A (horizontal axis) and showing the region of the instance space covered by one rule.]
a) Initial rule: IF 4 ≤ B ≤ 6 THEN •
b) First generalization: IF 4 ≤ B ≤ 7 THEN •
c) Second generalization: IF 3 ≤ B ≤ 6 THEN •
Fig. 2. Alternative generalizations to a rule
 