Note that the confidence is calculated based on the whole database. Then we apply the Apriori algorithm [2] within each partition $P^{(k)}$ to generate the class rule set $R^{(k)} = \{r_\ell^{(k)}\}$:

$$R^{(k)} = \mathrm{Apriori}\left(P^{(k)},\, \mathrm{MinSupp}^{(k)}\right), \quad k = 1,\dots,N_{TC}; \qquad (14)$$

$$r_\ell^{(k)} : F_\ell^{(k)} \to C^{(k)}, \quad \ell = 1,\dots,N_R^{(k)}, \qquad (15)$$
where $N_R^{(k)}$ denotes the number of rules in the rule set $R^{(k)}$ and $\mathrm{MinSupp}^{(k)}$ denotes the minimum support value that each rule in $R^{(k)}$ must satisfy.
The final rule set $R_{\mathrm{ARM}}$ is the union of the class rule sets, that is,

$$R_{\mathrm{ARM}} = \bigcup_{k=1}^{N_{TC}} R^{(k)}. \qquad (16)$$
This procedure is what we refer to as partitioned-ARM.
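As a concrete illustration of partitioned-ARM, the following self-contained Python sketch partitions a toy database by class label, mines each partition against its own support threshold, and takes the union of the class rule sets as in Eqs. (14)-(16). The data, the helper name mine_class_rules, and the threshold are all hypothetical, and the helper enumerates subsets by brute force rather than performing Apriori's level-wise candidate generation; it stands in for the Apriori call only illustratively.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy database: each record is (set of feature-value
# items, class label). Names and values are illustrative only.
DATA = [
    ({"f1=a", "f2=x"}, "c1"),
    ({"f1=a", "f2=y"}, "c1"),
    ({"f1=b", "f2=x"}, "c2"),
    ({"f1=a", "f2=x"}, "c2"),
]

def mine_class_rules(partition, min_supp):
    """Stand-in for the Apriori call in Eq. (14): return every itemset F
    whose support within the class partition P^(k) is at least
    MinSupp^(k). (Brute-force subset enumeration, not level-wise
    Apriori candidate generation.)"""
    n = len(partition)
    counts = defaultdict(int)
    for items in partition:
        for size in range(1, len(items) + 1):
            for subset in combinations(sorted(items), size):
                counts[subset] += 1
    return [frozenset(f) for f, c in counts.items() if c / n >= min_supp]

# Partition the database by class label: one partition P^(k) per class.
partitions = defaultdict(list)
for items, label in DATA:
    partitions[label].append(items)

# Union of the class rule sets, R_ARM = U_k R^(k) (Eqs. 15-16): each
# frequent antecedent F yields the class rule F -> C^(k).
R_ARM = [(F, label)
         for label, part in partitions.items()
         for F in mine_class_rules(part, min_supp=0.5)]

for F, C in R_ARM:
    print(sorted(F), "->", C)
```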
An antecedent $F_\ell^{(k)}$ of a rule $r_\ell^{(k)}$ is of the form

$$F_\ell^{(k)} = \left\langle f_1^{(k)}, f_2^{(k)}, \dots, f_{N_F}^{(k)} \right\rangle, \qquad (17)$$
and we assume that each feature value $f_j^{(k)}$ can assume either a singleton value $\Theta_{f_j}^{(i)}$, $i = 1,\dots,n_{f_j}$, or the completely ambiguous value $\Theta_{f_j}$. In other words, the only feature imperfection we allow is a missing value, which is modeled as a complete ambiguity. This is in contrast to the class label ambiguity, where both partial and complete ambiguities are allowed.
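This feature-value model, in which each $f_j^{(k)}$ is either a singleton from its frame or the whole frame $\Theta_{f_j}$ when missing, can be captured with a small sketch. The frames, feature names, and the use of None as the marker for a missing value are assumptions for illustration:

```python
# Hypothetical frames of discernment for two features; each frame
# Theta_{f_j} lists the n_{f_j} possible singleton values.
FRAMES = {"f1": {"a", "b"}, "f2": {"x", "y", "z"}}

def encode(feature, value):
    """Return a singleton {value}, or the whole frame Theta_{f_j}
    when the value is missing (complete ambiguity)."""
    if value is None:
        return frozenset(FRAMES[feature])  # completely ambiguous
    if value not in FRAMES[feature]:
        raise ValueError(f"{value!r} not in frame of {feature}")
    return frozenset({value})              # singleton value

# Antecedent F = <f_1, f_2> with f_2 missing, per Eq. (17).
antecedent = {"f1": encode("f1", "a"), "f2": encode("f2", None)}
```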
2.5 Rule Pruning

While the classification accuracy of the classifier is a function of the integrity of the rule set, its efficiency and the computational burden it imposes depend on the size of the rule set. Hence, to increase efficiency and reduce the associated computational burden, we propose to prune the rules.
Consider the antecedent $F_\ell^{(k)}$ of the rule $r_\ell^{(k)}$. Then, $f_j^{(k)} = \Theta_{f_j}$ indicates complete ambiguity regarding the feature value $f_j^{(k)}$. In this case, the feature value does not provide any valuable information. To formalize the relevant notions, we introduce
Definition 1 (Level of Abstraction (LoA)). For a given rule $r_\ell^{(k)} : F_\ell^{(k)} \to C^{(k)}$, $\ell = 1,\dots,N_R^{(k)}$, $k = 1,\dots,N_{TC}$, define

$$\mathrm{LoA}\left[r_\ell^{(k)} : f_j^{(k)}\right] = \begin{cases} 1, & \text{if } f_j^{(k)} = \Theta_{f_j}; \\ 0, & \text{otherwise}, \end{cases} \quad j = 1,\dots,N_F. \qquad (18)$$
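Reading Eq. (18) as a per-feature indicator (the 0/1 case values are an assumption here; the prose above only establishes that a completely ambiguous value carries no information), the LoA of each feature value, and a rule-level count of ambiguous features, might be computed as in this sketch, which reuses the hypothetical frames from the earlier example:

```python
FRAMES = {"f1": {"a", "b"}, "f2": {"x", "y", "z"}}  # hypothetical frames

# Antecedent with f1 observed as a singleton and f2 missing
# (encoded as its whole frame, i.e., complete ambiguity).
antecedent = {"f1": frozenset({"a"}), "f2": frozenset(FRAMES["f2"])}

def loa(feature, value):
    """Per-feature LoA in the spirit of Eq. (18): 1 when the value is
    the completely ambiguous Theta_{f_j}, 0 when it is a singleton.
    (The 0/1 case values are an assumption, not taken verbatim from
    the source.)"""
    return 1 if value == frozenset(FRAMES[feature]) else 0

# Rule-level aggregate: the number of antecedent features that carry
# no information; rules with high LoA are natural pruning candidates.
rule_loa = sum(loa(f, v) for f, v in antecedent.items())
print(rule_loa)  # -> 1
```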