Each of these classes can be treated as a distinct data set. The ARC-BC algorithm by [3], for instance, mines class association rules (CARs) from each class separately, using a single relative support threshold that is applied as a constraint to each class in turn. This interpretation also opens up several new possibilities.
The first, and potentially most important, is that it opens up the supervised pattern mining setting to all pattern languages: whether itemsets, sequences, trees, or graph-structured data and patterns, the techniques that we describe in this section are applicable to all of them.
Second, there are new ways of using significance and quality measures.
Multiple Support Thresholds A first possibility is to use several support thresholds. The XRules classifier [41], for instance, uses a separate minimum support threshold for each class. It is also a first example of supervised pattern mining in a pattern domain other than itemsets, producing predictive rules whose bodies consist of tree fragments; these are called structural rules in that work.
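To make the per-class view concrete, the following is a minimal sketch of mining each class as a separate data set with its own minimum support threshold, in the spirit of ARC-BC and XRules but not their actual algorithms; the brute-force miner, all names, and the max_size cap (standing in for a real Apriori-style search) are ours.

from itertools import combinations
from collections import Counter

def frequent_itemsets(transactions, min_rel_support, max_size=3):
    # Count every itemset of up to max_size items over all transactions.
    counts = Counter()
    for t in transactions:
        for k in range(1, max_size + 1):
            for itemset in combinations(sorted(t), k):
                counts[itemset] += 1
    n = len(transactions)
    # Keep itemsets whose relative support reaches the threshold.
    return {s: c / n for s, c in counts.items() if c / n >= min_rel_support}

def mine_per_class(data_by_class, min_support_by_class):
    # Treat each class as a distinct data set with its own threshold.
    return {cls: frequent_itemsets(txns, min_support_by_class[cls])
            for cls, txns in data_by_class.items()}

Called as, e.g., mine_per_class({"pos": ..., "neg": ...}, {"pos": 0.1, "neg": 0.2}), this produces one pattern set per class, which a classifier in the style of XRules would then turn into predictive rules.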
Instead of minimum support constraints, it is also natural to use maximum support constraints: a rule that is specific to one class should, after all, not cover many examples of classes other than the one it predicts. The technique introduced by [22], for instance, exploits this observation by finding patterns that are frequent within one class but infrequent in the other, building on a relationship with version space theory from machine learning.
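Combining the two constraint types, a sketch of such a contrast filter (reusing the frequent_itemsets function from the sketch above; the threshold names are ours) might look as follows:

def contrast_patterns(pos, neg, min_pos_support, max_neg_support, max_size=3):
    # Mine patterns frequent in the target class ...
    frequent_pos = frequent_itemsets(pos, min_pos_support, max_size)
    result = {}
    for itemset, pos_supp in frequent_pos.items():
        # ... and keep only those that are infrequent in the other class.
        neg_supp = sum(1 for t in neg if set(itemset) <= set(t)) / len(neg)
        if neg_supp <= max_neg_support:
            result[itemset] = (pos_supp, neg_supp)
    return result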
The CCCS classifier [4] goes further still, dropping the minimum support constraint entirely and relying on a maximum support constraint alone. The authors argue that infrequent patterns in a class can be found by enumerating small subsets of the transactions in that class.
The problem that remains in each of these cases is similar to that for single support thresholds: how to set the parameters. A pattern that occurs in 50 % of one class and 15 % of the other could be considered a valuable predictive pattern, as might be a pattern that occurs in 80 % of the first and 30 % of the second. Support constraints that accommodate both patterns, however, e.g. $supp_{\min} = 0.5$ and $supp_{\max} = 0.3$, would also allow results of questionable usefulness.
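A small worked illustration of this effect, using the growth rate (the ratio between the two supports) as one common yardstick of discriminative power; the pattern names and the third pattern's numbers are ours:

# (positive support, negative support) per pattern; p3 is a hypothetical
# pattern that the fixed thresholds admit despite weak discrimination.
patterns = {"p1": (0.50, 0.15), "p2": (0.80, 0.30), "p3": (0.50, 0.30)}
supp_min, supp_max = 0.5, 0.3
for name, (pos, neg) in patterns.items():
    passes = pos >= supp_min and neg <= supp_max
    print(name, "passes:", passes, "growth rate: %.2f" % (pos / neg))
# p1 and p2 pass with growth rates 3.33 and 2.67, but so does p3 at only 1.67.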
To address this, the Fitcare classifier proposed by [10] takes this idea further and uses a much larger parameter set: given $k$ classes, each class is mined separately, parametrized by a minimum support constraint and $k-1$ maximum support constraints on all other classes. To make this manageable, the support constraints are dynamically adjusted during mining.
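A hedged sketch of the resulting constraint check for $k$ classes (Fitcare's dynamic adjustment of the thresholds is not modelled here; all names are ours):

def satisfies_k_class_constraints(pattern, data_by_class, target,
                                  min_support, max_supports):
    # Relative support of the pattern within one class's transactions.
    def rel_support(txns):
        return sum(1 for t in txns if set(pattern) <= set(t)) / len(txns)

    # One minimum support constraint on the class being mined ...
    if rel_support(data_by_class[target]) < min_support:
        return False
    # ... and k - 1 maximum support constraints on all other classes.
    return all(rel_support(txns) <= max_supports[cls]
               for cls, txns in data_by_class.items() if cls != target)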
Statistical Measures A popular alternative approach is the use of constraints on
measures specifically designed for supervised data. These measures typically serve as
a replacement for confidence in selecting relevant predictive patterns; the underlying
patterns are still found using a minimum support threshold on the complete data.
As a straightforward example, consider the accuracy measure:
Definition 2.5 Given two classes $D^+$, $D^-$, and a pattern $r$, the accuracy of $r$ is defined as
$$acc(r) = \frac{supp_{D^+}(r) + \left(|D^-| - supp_{D^-}(r)\right)}{|D|}.$$
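Transcribed directly into code, the definition counts the positive examples a pattern covers plus the negative examples it leaves uncovered (function and variable names are ours; supports are absolute counts, as in the definition):

def accuracy(pattern, pos, neg):
    # supp_{D+}(r) and supp_{D-}(r) as absolute cover counts.
    supp_pos = sum(1 for t in pos if set(pattern) <= set(t))
    supp_neg = sum(1 for t in neg if set(pattern) <= set(t))
    # acc(r) = (supp_{D+}(r) + (|D-| - supp_{D-}(r))) / |D|
    return (supp_pos + (len(neg) - supp_neg)) / (len(pos) + len(neg))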
In general, most measures for evaluating the predictive power of a rule can be expressed as functions of the values in the contingency table:
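In the notation of Definition 2.5, such a contingency table has the standard form below (our reconstruction, which may differ from the layout used in the original):
$$\begin{array}{l|cc|c}
 & D^+ & D^- & \Sigma \\
\hline
\text{covered by } r & supp_{D^+}(r) & supp_{D^-}(r) & supp(r) \\
\text{not covered by } r & |D^+| - supp_{D^+}(r) & |D^-| - supp_{D^-}(r) & |D| - supp(r) \\
\hline
\Sigma & |D^+| & |D^-| & |D|
\end{array}$$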