Databases Reference
In-Depth Information
X
within sequence
SID
,and
Class
is the class label associated to sequence
SID
. (b) Two flags,
isClosed
and
isGenerator
, stating when sequence
X
is a candidate closed or generator sequence, respectively. (c) The set
G
(
X
)
including the sequences which are generators of
X
.
The proposed algorithm has a structure similar to GSP [5], where sequence
mining is performed by means of a levelwise search. To increase the e
ciency
of our approach, we associate to each sequence an id-list similar to the one
in [17].
A sequence
X
generates a set of classification rules having
X
as antecedent,
and the class labels in the id-list of
X
as consequent. The support of
X
(
sup
(
X
)) is the number of different
SIDs
in the id-list of
X
.Forarule
r
:
X → c
, the support (
sup
(
r
)) is the number of different
SIDs
in the id-list
labeled by the class label
c
. The confidence is given by
conf(r)=sup(r)/sup(X)
.
The algorithm, whose pseudocode is shown in Fig. 1, is described in the
following. As a preliminary step, we compute the set
1
of 1-sequences which
encodes at least one frequent classification rule (line 3). All sequences in
M
1
M
1
,the
are generator sequences by Definition 7. For each sequence
X
∈M
set
(
X
) of its generator sequences is initialized with the sequence itself. All
sequences in
G
1
are also candidate closed sequences by Definition 5. Hence,
both flags
isClosed
and
isGenerator
are set to true.
M
k
+1
.
At iteration
k
+1 we generate set
k
+1
by joining
k
with
Generating
M
M
M
k
. Function
generate cand closed
(line 10) generates a new (k+1)-sequence
M
k
.
Our generation method is based on the contiguous subsequence concept
(similar to GSP [5]). Sequence
Z
k
+1
by combining two k-sequences
X,Y
Z
∈M
∈M
k
+1
∈M
is generated from two sequences
1.
CompactForm Miner
(
D
,
minsup
,
minconf
,
maxgap
)
2.
{CRC
=
CCRS
=
∅
;
3.
k=1;
1
=
compute M
1
(
D, minsup
);
4.
M
1
5.
for all
X ∈M
6.
CRC
=
CRC ∪{extract general rules
(
X
,
minsup
,
minconf
)
}
;
k
7.
while(
M
=
∅
)
k
+1
=
∅
;
8.
{M
k
M
k
9.
for all (
X, Y
)
∈M
10.
{Z
=
generate cand closed
(
X
,
Y
,
maxgap
);
11.
if (
support pruning
(
Z
,
minsup
)==false) then
k
+1
=
M
k
+1
12.
{M
∪{Z}
;
13.
evaluate closure
(
Z
,
X
,
Y
);
}}
k
with
X.isClosed
==
true
14.
for all
X ∈M
15.
CCRS
=
CCRS ∪{extract compact rules
(
X
,
minsup
,
minconf
)
}
;
k
+1
with
X.isGenerator
==
true
16.
for all
X ∈M
17.
CRC
=
CRC ∪{extract general rules
(
X
,
minsup
,
minconf
)
}
;
18.
k= k+1;
}
Fig. 1.
Compact form mining algorithm