While our experiments have been performed in a machine learning context,
the results are applicable in wider knowledge acquisition contexts. For example,
interactive knowledge acquisition environments [3, 13] present users with
alternative rules all of which perform equally well on example data. Where the user
is unable to bring external knowledge to bear to make an informed judgement
about the relative merits of those rules, the system is able to offer no further
advice. Our experiments suggest that relative generality is a factor that an
interactive knowledge acquisition system might profitably utilize.
Our experiments also demonstrate that the effect that we discuss is one that
applies frequently in real-world knowledge acquisition tasks. The alternative
rules used in our experiments were all rules of varying levels of generality that
covered exactly the same training instances. In other words, it was not possible
to distinguish between these rules using traditional measures of rule quality
based on performance on a training set, such as information measures. The
only exceptions were the data sets for which the rules at differing levels of
generality were all identical. In all such cases the results were excluded from the
win/draw/loss record reported in Tables 3 to 5. Hence the sum of the values
in each win/draw/loss record places a lower bound on the number of data sets
for which there were variants of the initial rule all of which covered the same
training instances. Thus, for at least 47 out of 50 data sets, there are variants of
the C4.5rules rule with the greatest cover that cover exactly the same training
cases. For at least 38 out of 50 data sets, there are variants of the first rule
generated by C4.5rules that cover exactly the same training cases. This effect is
not a hypothetical abstraction; it is a frequent occurrence of immediate practical
import.
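The situation described above can be illustrated with a minimal sketch. The data, attribute names, thresholds, and helper functions below are hypothetical and are not taken from the experiments; the sketch only shows how rules at different levels of generality can cover exactly the same training instances, so that no measure based on training-set performance can separate them, while still disagreeing on an unseen case.

```python
# Hypothetical toy training data; values are illustrative only.
train = [
    {"age": 25, "income": 30000},
    {"age": 35, "income": 45000},
    {"age": 55, "income": 80000},
    {"age": 60, "income": 90000},
]


def covers(rule, instance):
    """A rule maps each attribute to an exclusive upper bound (attr < bound)."""
    return all(instance[attr] < bound for attr, bound in rule.items())


def cover(rule, data):
    """Return the set of indices of the instances in `data` covered by `rule`."""
    return frozenset(i for i, x in enumerate(data) if covers(rule, x))


# One specific rule and two generalizations formed by deleting a condition.
specific = {"age": 40, "income": 50000}
general_a = {"age": 40}         # income condition deleted
general_b = {"income": 50000}   # age condition deleted

# All three rules cover the same training instances.
same_cover = (
    cover(specific, train) == cover(general_a, train) == cover(general_b, train)
)

# An unseen (hypothetical) case on which the rules disagree.
unseen = {"age": 30, "income": 70000}
decisions = {
    "specific": covers(specific, unseen),
    "general_a": covers(general_a, unseen),
    "general_b": covers(general_b, unseen),
}
```

Here `same_cover` is true even though `decisions` differs between the rules, which is the kind of case in which a criterion other than training-set performance is needed.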
In such circumstances, when it is necessary to select between alternative rules
with equal performance on the training data, one approach has been to select
the least complex rule [14]. However, some recent authors have argued that
complexity is not an effective rule quality metric [8, 15]. We argue here that
generality provides an alternative criterion on which to select between such rules,
one that allows for reasoning about the trade-offs inherent in the choice of one
rule over the other, rather than providing a blanket prescription.
5 On the Difficulty of Measuring Degree of Generalization
It might be tempting to believe that our hypotheses could be extended by
introducing a measure of magnitude of generalization together with predictions
about the magnitude of the effects on prediction accuracy that may be expected
from generalizations of different magnitude.
However, we believe that it is not feasible to develop meaningful measures of
magnitude of generalization suitable for such a purpose. Consider, for example,
the possibility of generalizing a rule with conditions age < 40 and income <
50000 by deleting either condition. Which is the greater generalization? It might
be thought that the greater generalization is the one that covers the greater
number of cases. However, if one rule covers more cases than another then there