Supervised Pattern Mining and Applications to Classification - Frequent Pattern Mining

Supervised Pattern Mining

The majority of texts in this topic deal with different unsupervised pattern mining

settings. We will quickly repeat the relevant definitions here to clarify which setting

we discuss:

Definition 2.1

Given a data language $\mathcal{L}_{\mathcal{D}}$ which describes the syntax of potential transactions in the data, a transactional data set $\mathcal{D} \subseteq \mathcal{L}_{\mathcal{D}}$ is of the form $\mathcal{D} = \{d_1, \ldots, d_n\}$, $d_i \in \mathcal{L}_{\mathcal{D}}$. Given a pattern language $\mathcal{L}_{\pi}$, we define a function $match : \mathcal{L}_{\pi} \times \mathcal{L}_{\mathcal{D}} \to \{0, 1\}$, which decides whether a pattern occurs in a transaction or not. The set of transactions from a data set matched by a pattern $\pi$ is referred to as its cover: $cov_{\mathcal{D}}(\pi) = \{d \in \mathcal{D} \mid match(\pi, d)\}$, and the size of the cover is referred to as $\pi$'s (absolute) support: $supp_{\mathcal{D}}(\pi) = |cov_{\mathcal{D}}(\pi)|$.

The easiest instantiation of this definition is the case of itemset databases: given a set of items $\mathcal{I}$, $\mathcal{L}_{\mathcal{D}} = \mathcal{L}_{\pi} = 2^{\mathcal{I}}$, and $match(\pi, d) \Leftrightarrow \pi \subseteq d$. For other types of data, such as for instance graph data or sequential data, alternative definitions for $\mathcal{L}_{\mathcal{D}}$, $\mathcal{L}_{\pi}$ and $match$ can be used, and most ideas presented in the rest of this paper can be applied immediately for these alternative definitions.
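As a minimal sketch of the itemset instantiation, the following Python fragment implements match, cover and support with subset tests. The item names, the example data and the function names `match`, `cover` and `support` are illustrative assumptions and do not come from the text. The data set is held in a list rather than a set, which is also an assumption, so that identical transactions would be counted separately.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def match(pattern: Itemset, transaction: Itemset) -> bool:
    """Itemset case: a pattern matches iff it is a subset of the transaction."""
    return pattern <= transaction


def cover(pattern: Itemset, dataset: List[Itemset]) -> List[Itemset]:
    """All transactions of the data set that the pattern matches."""
    return [d for d in dataset if match(pattern, d)]


def support(pattern: Itemset, dataset: List[Itemset]) -> int:
    """Absolute support: the size of the cover."""
    return len(cover(pattern, dataset))


# Hypothetical toy data, for illustration only.
dataset = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"butter"}),
    frozenset({"bread", "butter"}),
]
pattern = frozenset({"bread", "milk"})
print(support(pattern, dataset))
```

For graph or sequence data, only `match` would need to change (for example to a subgraph or subsequence test); `cover` and `support` could stay as they are.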

The biggest difference between unsupervised and supervised pattern mining is

the presence of a variable of interest. This variable is often the class variable that can

take on one out of several nominal class labels.

Definition 2.2

Given a data language $\mathcal{L}_{\mathcal{D}}$, and a set of class labels $C = \{C_1, \ldots, C_k\}$, a labeled data set $\mathcal{D}^C$ is of the form $\mathcal{D}^C = \{(d_1, c_1), \ldots, (d_n, c_n)\}$, $d_i \in \mathcal{L}_{\mathcal{D}}$, $c_i \in C$.
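A labeled data set can be sketched as a list of (transaction, label) pairs. The fragment below assumes the itemset setting with subset matching and counts, per class label, how many labeled transactions a pattern matches. The helper name `class_supports`, the labels "pos" and "neg" and the data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import FrozenSet, List, Tuple

Itemset = FrozenSet[str]
LabeledDataset = List[Tuple[Itemset, str]]


def class_supports(pattern: Itemset, labeled_data: LabeledDataset) -> Counter:
    """Per class label, count the labeled transactions matched by the pattern."""
    return Counter(c for d, c in labeled_data if pattern <= d)


# Hypothetical labeled toy data: pairs (d_i, c_i).
labeled_data = [
    (frozenset({"a", "b"}), "pos"),
    (frozenset({"a", "b", "c"}), "pos"),
    (frozenset({"a", "c"}), "neg"),
    (frozenset({"b", "c"}), "neg"),
    (frozenset({"a"}), "neg"),
]
counts = class_supports(frozenset({"a", "b"}), labeled_data)
print(dict(counts))
```

Splitting the support by class in this way is what lets a supervised method ask how strongly a pattern is associated with one label rather than another.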

The most common setting is that of classification, in which the task is to learn

a mechanism to predict the class label for unseen data based on rules or patterns.

Alternatively, the target for prediction can also be numerical, requiring a regression

model.

However, another popular setting is that of subgroup discovery, which can be

generalized to exceptional model mining when the target attribute is not a single

categorical attribute [24].

Instead of prediction, the goal in this setting is the characterization of subsets

of the data, i.e. subgroups. The mined rules are therefore not a means to the end of

prediction but an end in themselves, and users are expected to inspect them to gain a

deeper understanding of the data. In other words, classification is concerned with

outcomes on future data, subgroup discovery with descriptions of current data.

As a result of this, the quality criteria and heuristics used are sometimes different.

However, many of the techniques used are also shared, and for reasons of clarity

of presentation, we will mainly focus on classification in this chapter, and make

differences to the other settings explicit when appropriate.
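The contrast between the two settings can be sketched with a single rule evaluated in two ways. The text does not name specific quality measures, so the choices here are assumptions: predictive accuracy on unseen data stands in for the classification view, and weighted relative accuracy (WRAcc), a measure commonly used in subgroup discovery, stands in for the descriptive view. All names and data are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Itemset = FrozenSet[str]
Labeled = List[Tuple[Itemset, str]]


def predict(pattern: Itemset, target: str, default: str, d: Itemset) -> str:
    """Rule 'pattern -> target', falling back to a default class otherwise."""
    return target if pattern <= d else default


def accuracy(pattern: Itemset, target: str, default: str, unseen: Labeled) -> float:
    """Classification view: fraction of unseen transactions labeled correctly."""
    hits = sum(predict(pattern, target, default, d) == c for d, c in unseen)
    return hits / len(unseen)


def wracc(pattern: Itemset, target: str, data: Labeled) -> float:
    """Descriptive view: p(cover) * (p(target | cover) - p(target)) on current data."""
    covered = [c for d, c in data if pattern <= d]
    if not covered:
        return 0.0
    p_cover = len(covered) / len(data)
    p_target_in_cover = covered.count(target) / len(covered)
    p_target = sum(c == target for _, c in data) / len(data)
    return p_cover * (p_target_in_cover - p_target)


# Hypothetical current (training) data and unseen data.
current = [
    (frozenset({"a", "b"}), "pos"),
    (frozenset({"a", "b", "c"}), "pos"),
    (frozenset({"a", "c"}), "neg"),
    (frozenset({"b", "c"}), "neg"),
]
unseen = [
    (frozenset({"a", "b", "d"}), "pos"),
    (frozenset({"c"}), "neg"),
]
rule = frozenset({"a", "b"})
print(wracc(rule, "pos", current))
print(accuracy(rule, "pos", "neg", unseen))
```

The same pattern is scored against current data in one case and against future data in the other, which is one reason why the two settings can end up preferring different patterns.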
