Each of these classes can be treated as a distinct data set. The ARC-BC algorithm by [3], for instance, mines class association rules (CARs) from each class separately, using a single relative support threshold that is applied as a constraint to each class in turn. This interpretation also opens up several new possibilities.
The first, and potentially most important, is that it opens up the supervised pattern mining setting to all pattern languages: whether itemsets, sequences, trees, or graph-structured data and patterns, the techniques that we describe in this section are applicable to all of them.
Second, there are new ways of using significance and quality measures.
Multiple Support Thresholds A first possibility is to use several support thresholds. The XRules classifier [41], for instance, uses a separate minimum support threshold for each class. It is also a first example of supervised pattern mining in a pattern domain other than itemsets, producing predictive rules whose bodies consist of tree fragments; these are called structural rules in that work.
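To make the per-class view concrete, the following is a minimal sketch of mining each class as a separate data set with its own minimum support threshold, in the spirit of ARC-BC and XRules but not their actual algorithms; the brute-force miner, all names, and the max_size cap (standing in for a real Apriori-style search) are ours.

from itertools import combinations
from collections import Counter

def frequent_itemsets(transactions, min_rel_support, max_size=3):
    # Count every itemset of up to max_size items over all transactions.
    counts = Counter()
    for t in transactions:
        for k in range(1, max_size + 1):
            for itemset in combinations(sorted(t), k):
                counts[itemset] += 1
    n = len(transactions)
    # Keep itemsets whose relative support reaches the threshold.
    return {s: c / n for s, c in counts.items() if c / n >= min_rel_support}

def mine_per_class(data_by_class, min_support_by_class):
    # Treat each class as a distinct data set with its own threshold.
    return {cls: frequent_itemsets(txns, min_support_by_class[cls])
            for cls, txns in data_by_class.items()}

Called as, e.g., mine_per_class({"pos": ..., "neg": ...}, {"pos": 0.1, "neg": 0.2}), this produces one pattern set per class, which a classifier in the style of XRules would then turn into predictive rules.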
Instead of minimum support constraints, it is also natural to use maximum support constraints: a rule that is specific to one class should, after all, not cover many examples of classes other than the one it predicts. The technique introduced by [22], for instance, exploits this observation by finding patterns that are frequent within one class but infrequent in the other, building on a relationship with version space theory from machine learning.
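Combining the two constraint types, a sketch of such a contrast filter (reusing the frequent_itemsets function from the sketch above; the threshold names are ours) might look as follows:

def contrast_patterns(pos, neg, min_pos_support, max_neg_support, max_size=3):
    # Mine patterns frequent in the target class ...
    frequent_pos = frequent_itemsets(pos, min_pos_support, max_size)
    result = {}
    for itemset, pos_supp in frequent_pos.items():
        # ... and keep only those that are infrequent in the other class.
        neg_supp = sum(1 for t in neg if set(itemset) <= set(t)) / len(neg)
        if neg_supp <= max_neg_support:
            result[itemset] = (pos_supp, neg_supp)
    return result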
The CCCS classifier [4] goes further still, dropping the minimum support constraint entirely and relying on a maximum support constraint alone. The authors argue that infrequent patterns in a class can be found by enumerating small subsets of the transactions in that class.
The problem that remains in each of these cases is similar to that for single support thresholds: how to set the parameters. A pattern that occurs in 50 % of one class and 15 % of the other could be considered a valuable predictive pattern, as might be a pattern that occurs in 80 % of the first and 30 % of the second. Support constraints that accommodate both patterns, however, e.g. $supp_{\min} = 0.5$ and $supp_{\max} = 0.3$, would also allow results of questionable usefulness.
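A small worked illustration of this effect, using the growth rate (the ratio between the two supports) as one common yardstick of discriminative power; the pattern names and the third pattern's numbers are ours:

# (positive support, negative support) per pattern; p3 is a hypothetical
# pattern that the fixed thresholds admit despite weak discrimination.
patterns = {"p1": (0.50, 0.15), "p2": (0.80, 0.30), "p3": (0.50, 0.30)}
supp_min, supp_max = 0.5, 0.3
for name, (pos, neg) in patterns.items():
    passes = pos >= supp_min and neg <= supp_max
    print(name, "passes:", passes, "growth rate: %.2f" % (pos / neg))
# p1 and p2 pass with growth rates 3.33 and 2.67, but so does p3 at only 1.67.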
To address this, the Fitcare classifier proposed by [10] takes this idea further and uses a much larger parameter set: given $k$ classes, each class is mined separately, parametrized by a minimum support constraint and $k-1$ maximum support constraints on all other classes. To make this manageable, the support constraints are dynamically adjusted during mining.
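A hedged sketch of the resulting constraint check for $k$ classes (Fitcare's dynamic adjustment of the thresholds is not modelled here; all names are ours):

def satisfies_k_class_constraints(pattern, data_by_class, target,
                                  min_support, max_supports):
    # Relative support of the pattern within one class's transactions.
    def rel_support(txns):
        return sum(1 for t in txns if set(pattern) <= set(t)) / len(txns)

    # One minimum support constraint on the class being mined ...
    if rel_support(data_by_class[target]) < min_support:
        return False
    # ... and k - 1 maximum support constraints on all other classes.
    return all(rel_support(txns) <= max_supports[cls]
               for cls, txns in data_by_class.items() if cls != target)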
Statistical Measures A popular alternative approach is the use of constraints on
measures specifically designed for supervised data. These measures typically serve as
a replacement for confidence in selecting relevant predictive patterns; the underlying
patterns are still found using a minimum support threshold on the complete data.
As a straightforward example, consider the accuracy measure:
Definition 2.5 Given two classes $D^+$, $D^-$, and a pattern $r$, the accuracy of $r$ is defined as
$$acc(r) = \frac{supp_{D^+}(r) + \left(|D^-| - supp_{D^-}(r)\right)}{|D|}.$$
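Transcribed directly into code, the definition counts the positive examples a pattern covers plus the negative examples it leaves uncovered (function and variable names are ours; supports are absolute counts, as in the definition):

def accuracy(pattern, pos, neg):
    # supp_{D+}(r) and supp_{D-}(r) as absolute cover counts.
    supp_pos = sum(1 for t in pos if set(pattern) <= set(t))
    supp_neg = sum(1 for t in neg if set(pattern) <= set(t))
    # acc(r) = (supp_{D+}(r) + (|D-| - supp_{D-}(r))) / |D|
    return (supp_pos + (len(neg) - supp_neg)) / (len(pos) + len(neg))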
In general, most measures for evaluating the predictive power of a rule can be expressed as functions of the values in the contingency table:
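In the notation of Definition 2.5, such a contingency table has the standard form below (our reconstruction, which may differ from the layout used in the original):
$$\begin{array}{l|cc|c}
 & D^+ & D^- & \Sigma \\
\hline
\text{covered by } r & supp_{D^+}(r) & supp_{D^-}(r) & supp(r) \\
\text{not covered by } r & |D^+| - supp_{D^+}(r) & |D^-| - supp_{D^-}(r) & |D| - supp(r) \\
\hline
\Sigma & |D^+| & |D^-| & |D|
\end{array}$$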