2 Supervised Pattern Mining
The majority of texts in this topic deal with different unsupervised pattern mining
settings. We will quickly repeat the relevant definitions here to clarify which setting
we discuss:
Definition 2.1
Given a data language $\mathcal{L}_D$, which describes the syntax of potential
transactions in the data, a transactional data set $\mathcal{D} \subseteq \mathcal{L}_D$
is of the form $\mathcal{D} = \{d_1, \ldots, d_n\}$, $d_i \in \mathcal{L}_D$. Given a
pattern language $\mathcal{L}_\pi$, we define a function
$\mathit{match} : \mathcal{L}_\pi \times \mathcal{L}_D \to \{0, 1\}$, which decides
whether a pattern occurs in a transaction or not. The set of transactions from a data
set $\mathcal{D}$ matched by a pattern $\pi$ is referred to as its cover:
$\mathit{cov}_{\mathcal{D}}(\pi) = \{d \in \mathcal{D} \mid \mathit{match}(\pi, d) = 1\}$,
and the size of the cover is referred to as $\pi$'s (absolute) support:
$\mathit{supp}_{\mathcal{D}}(\pi) = |\mathit{cov}_{\mathcal{D}}(\pi)|$.
The easiest instantiation of this definition is the case of itemset databases: given
a set of items $\mathcal{I}$, $\mathcal{L}_\pi = \mathcal{L}_D = 2^{\mathcal{I}}$, and
$\mathit{match}(\pi, d) = 1 \Leftrightarrow \pi \subseteq d$. For other types
of data, such as graph data or sequential data, alternative definitions for
$\mathcal{L}_D$, $\mathcal{L}_\pi$ and $\mathit{match}$ can be used, and most ideas
presented in the rest of this chapter can be applied immediately to these alternative
definitions.
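As a minimal sketch of the itemset instantiation of Definition 2.1, the following Python fragment models transactions and patterns as frozensets of items, so that match reduces to a subset test. The function names, the toy items, and the use of a list rather than a set for the data set (so that duplicate transactions would be counted) are illustrative assumptions, not part of the original text.

```python
from typing import FrozenSet, List

# Assumption: items are plain strings; transactions and patterns are item sets.
Itemset = FrozenSet[str]


def match(pattern: Itemset, transaction: Itemset) -> int:
    """Return 1 if the pattern occurs in the transaction (pattern is a subset), else 0."""
    return 1 if pattern <= transaction else 0


def cover(pattern: Itemset, dataset: List[Itemset]) -> List[Itemset]:
    """All transactions of the data set that are matched by the pattern."""
    return [d for d in dataset if match(pattern, d) == 1]


def support(pattern: Itemset, dataset: List[Itemset]) -> int:
    """Absolute support: the size of the pattern's cover."""
    return len(cover(pattern, dataset))


# Hypothetical toy data set, used only for illustration.
dataset = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"butter"}),
    frozenset({"bread", "butter"}),
]
pattern = frozenset({"bread", "milk"})

print(cover(pattern, dataset))
print(support(pattern, dataset))
```

For graph or sequence data, only `match` would need to change (for example to a subgraph or subsequence test); `cover` and `support` are written purely in terms of `match`.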
The biggest difference between unsupervised and supervised pattern mining is
the presence of a variable of interest. This variable is often the class variable that can
take on one out of several nominal class labels.
Definition 2.2
Given a data language $\mathcal{L}_D$ and a set of class labels
$\mathcal{C} = \{C_1, \ldots, C_k\}$, a labeled data set $\mathcal{D}_C$ is of the form
$\mathcal{D}_C = \{(d_1, c_1), \ldots, (d_n, c_n)\}$, $d_i \in \mathcal{L}_D$,
$c_i \in \mathcal{C}$.
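To make Definition 2.2 concrete, the sketch below represents a labeled data set as a list of (transaction, label) pairs and counts how often each class label occurs among the transactions matched by a pattern. It assumes the itemset instantiation of match from Definition 2.1. The helper name `class_counts_in_cover`, the labels "pos" and "neg", and the toy data are hypothetical, and the per-class count is shown only as one natural way to inspect a labeled data set, not as a measure defined in the text.

```python
from collections import Counter
from typing import FrozenSet, List, Tuple

Itemset = FrozenSet[str]
# A labeled transaction is a pair (d_i, c_i) of an item set and a class label.
LabeledTransaction = Tuple[Itemset, str]


def match(pattern: Itemset, transaction: Itemset) -> int:
    """Itemset instantiation of match: 1 if pattern is a subset of the transaction."""
    return 1 if pattern <= transaction else 0


def class_counts_in_cover(
    pattern: Itemset, labeled_dataset: List[LabeledTransaction]
) -> Counter:
    """Count the class labels of all labeled transactions matched by the pattern."""
    return Counter(c for d, c in labeled_dataset if match(pattern, d) == 1)


# Hypothetical labeled toy data set with class labels C = {"pos", "neg"}.
labeled_dataset = [
    (frozenset({"bread", "milk"}), "pos"),
    (frozenset({"bread", "butter", "milk"}), "pos"),
    (frozenset({"butter"}), "neg"),
    (frozenset({"bread", "butter"}), "neg"),
    (frozenset({"bread", "milk", "eggs"}), "neg"),
]

counts = class_counts_in_cover(frozenset({"bread", "milk"}), labeled_dataset)
print(dict(counts))
```

Summing the counts over all labels gives the pattern's absolute support on the unlabeled transactions.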
The most common setting is that of classification, in which the task is to learn
a mechanism to predict the class label for unseen data based on rules or patterns.
Alternatively, the target for prediction can also be numerical, requiring a regression
model.
However, another popular setting is that of subgroup discovery, which can be
generalized to exceptional model mining when the target attribute is not a single
categorical attribute [24].
Instead of prediction, the goal in this setting is the characterization of subsets
of the data, i.e. subgroups. The mined rules are therefore not a means to the end of
prediction but an end in themselves, and users are expected to inspect them to gain a
deeper understanding of the data. In other words, classification is concerned with
outcomes on future data, whereas subgroup discovery is concerned with descriptions of
current data.
As a result of this, the quality criteria and heuristics used are sometimes different.
However, many of the techniques used are also shared, and for reasons of clarity
of presentation, we will mainly focus on classification in this chapter, and make
differences to the other settings explicit when appropriate.