we will see below, certain types of decision trees can be used to solve the
instability problem.
4.8 Interestingness Measures
The number of classification patterns generated can be very large, and
different approaches may produce different sets of patterns. The patterns
extracted during the classification process can be represented in the form
of rules, known as classification rules. It is important to evaluate the
discovered patterns and identify those that are valid and provide new
knowledge. Techniques that aim at this goal are broadly referred to as
interestingness measures, and the interestingness of the patterns
discovered by a classification approach may also be considered another
quality criterion. Some representative measures [Hilderman and Hamilton,
1999] for ranking the usefulness and utility of discovered classification
patterns (each path from the root to a leaf represents a different
pattern) are:
Rule-Interest Function. Piatetsky-Shapiro introduced the rule-interest
function [Piatetsky-Shapiro, 1991], which is used to quantify the
correlation between attributes in a classification rule. It is suitable
only for single classification rules, i.e. rules where both the left- and
right-hand sides correspond to a single attribute.
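As a minimal sketch, the rule-interest function is usually stated as
RI = |X and Y| - |X||Y| / N, where N is the total number of records and
|X|, |Y| and |X and Y| count the records matching the left-hand side, the
right-hand side and both sides of the rule. The function name and argument
names below are illustrative assumptions, not taken from the text.

```python
def rule_interest(n_xy, n_x, n_y, n_total):
    """Rule-interest of a rule X -> Y computed from record counts.

    n_xy    -- records satisfying both X and Y
    n_x     -- records satisfying X
    n_y     -- records satisfying Y
    n_total -- total number of records (N)

    Assumed form: RI = |X and Y| - |X| * |Y| / N, i.e. the observed
    co-occurrence count minus the count expected if X and Y were
    statistically independent.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_xy - (n_x * n_y) / n_total


# Hypothetical counts: 100 records, 40 match X, 50 match Y, 30 match both.
ri = rule_interest(n_xy=30, n_x=40, n_y=50, n_total=100)
print(ri)
```

A value of 0 indicates that X and Y are independent, a positive value
indicates that they occur together more often than expected by chance, and
a negative value indicates that they occur together less often.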
Smyth and Goodman's J-Measure. The J-measure is a measure for
probabilistic classification rules and is used to find the best rules relating
discrete-valued attributes [Smyth and Goodman, 1991]. A probabilistic
classification rule is a logical implication, X → Y, satisfied with
some probability p. The left- and right-hand sides of this implication
correspond to a single attribute. The right-hand side is restricted to
simple single-valued assignment expressions while the left-hand side may
be a conjunction of simple expressions.
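As a minimal sketch, the J-measure of a rule X → Y is commonly written as
J = p(X) [ p(Y|X) log2(p(Y|X) / p(Y)) + (1 - p(Y|X)) log2((1 - p(Y|X)) / (1 - p(Y))) ].
The function and parameter names below are illustrative assumptions, and
the probabilities are assumed to have been estimated from the data
beforehand.

```python
import math


def j_measure(p_x, p_y, p_y_given_x):
    """J-measure of a probabilistic rule X -> Y (assumed standard form).

    p_x         -- probability that the left-hand side X holds
    p_y         -- prior probability of the right-hand side Y
    p_y_given_x -- probability of Y given that X holds (the rule's p)
    """
    if not 0.0 < p_y < 1.0:
        raise ValueError("p_y must lie strictly between 0 and 1")
    if not 0.0 <= p_x <= 1.0 or not 0.0 <= p_y_given_x <= 1.0:
        raise ValueError("probabilities must lie in [0, 1]")

    def term(p, q):
        # Convention: 0 * log(0 / q) is taken to be 0.
        return 0.0 if p == 0.0 else p * math.log2(p / q)

    # Cross-entropy between the posterior and prior of Y, weighted by p(X).
    return p_x * (term(p_y_given_x, p_y) + term(1.0 - p_y_given_x, 1.0 - p_y))


# Hypothetical rule: X holds for half the records and always implies Y,
# while Y holds for half the records overall.
print(j_measure(p_x=0.5, p_y=0.5, p_y_given_x=1.0))
```

The measure is 0 when knowing X does not change the probability of Y, and
it grows both with how often the rule applies, p(X), and with how far the
rule moves the probability of Y away from its prior.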
General Impressions. In Liu et al. (1997), general impressions are proposed
as an approach for evaluating the importance of classification rules. The
approach compares discovered rules to an approximate or vague description
of what is considered to be interesting. Thus a general impression can be
considered a kind of specification language.
Gago and Bento's Distance Metric. The distance metric measures the
distance between classification rules and is used to determine the rules
that provide the highest coverage for the given data. The rules with the