Databases Reference
In-Depth Information
Lemma 3 (Specialistic Rule).
Let
R
be the set of frequent sequential classi-
fication rules for
D
,andr
∈R
, r
:
X
→
c, an arbitrary rule. r is a specialistic
rule in
R
iff X is a closed sequence in
D
.
Proof.
We first prove the su
cient condition. Let
r
i
:
X
→
c
be an arbitrary
rule in
,where
X
is a closed sequence. By Definition 5, if
X
is a closed
sequence then
R
Ψ
Y
it is
sup
Φ
(
X
)
>sup
Φ
(
Y
).
Thus,
r
i
is a specialistic rule according to Definition 10. We now prove the
necessary condition. Let
r
i
:
X
∀
r
j
:
Y
→
c
in
R
, with
X
.
For the sake of contradiction, let
X
not be a closed sequence. It follows that
∃
→
c
be an arbitrary specialistic rule in
R
Ψ
Y
and
sup
Φ
(
X
)=
sup
Φ
(
Y
). Hence, from
Property 2,
{
(
SID,S,c
)
∈D|Y
Φ
S}
=
{
(
SID,S,c
)
∈D|X
Φ
S}
,and
thus
sup
Φ
(
r
i
)=
sup
Φ
(
r
j
). It follows that
r
i
is not a specialistic rule according
to Definition 10, a contradiction.
r
j
:
Y
→
c
in
R
, with
X
5.2 Sequential Classification Rule Cover
In this section we present a compact form which is based on the general rules
in a given set
. This form allows the classification of unlabeled data without
information loss with respect to the complete rule set
R
R
. Hence, it is equivalent
to
for classification purposes.
Intuitively, we say that two rule sets are equivalent if they contain the
same knowledge. When referring to a classification rule set, its knowledge is
represented by its capability in classifying an arbitrary data object
d
.Note
that
d
can be matched by different rules in
R
. Each rule
r
labels
d
with a
class
c
. The estimated accuracy of
r
in predicting the correct class is usually
given by
r
's support and confidence.
The equivalence between two rule sets can be formalized in terms of rule
cover.
R
Definition 11 (Sequential Classification Rule Cover).
Let
R
1
and
R
2
⊆
R
1
be two arbitrary sequential classification rule sets extracted from
D
.
R
2
is
a sequential classification rule cover of
r
j
∈R
2
, such that
r
i
is a specialization of r
j
according to Definition 8 and (ii) R
2
is minimal.
R
1
if (i)
∀
r
i
∈R
1
,
∃
R
1
, the two sets classify in the
same way an arbitrary data object
d
.Ifarule
r
i
∈R
1
labels
d
with class
c
,
then in
When
R
2
⊆R
1
is a classification cover of
R
2
there is a rule
r
j
,where
r
i
is a specialization of
r
j
,and
r
j
labels
d
with the same class
c
(see Lemma 1).
r
i
and
r
j
have the same support
and confidence. It follows that
R
1
and
R
2
are equivalent for classification
purposes.
We propose a compact representation of rule set
R
which includes all
general rules in
. This compact representation, named
classification rule
cover
, encodes all necessary information to perform classification, but it does
not allow the regeneration of the complete rule set
R
R
.