Note that the confidence is calculated based on the whole database. Then we apply the Apriori algorithm [2] within each partition $P^{(k)}$ to generate the class rule set $R^{(k)} = \{r_\ell^{(k)}\}$:

$$R^{(k)} = \mathrm{Apriori}\bigl( P^{(k)}, \mathrm{MinSupp}^{(k)} \bigr), \quad k = 1, \ldots, N_{TC}; \qquad (14)$$

$$R^{(k)} = \bigl\{ r_\ell^{(k)} : F_\ell^{(k)} \Rightarrow C^{(k)}, \; \ell = 1, \ldots, N_{R^{(k)}} \bigr\}, \qquad (15)$$
where $N_{R^{(k)}}$ denotes the number of rules in the rule set $R^{(k)}$ and $\mathrm{MinSupp}^{(k)}$ denotes the minimum support value that each rule in $R^{(k)}$ must satisfy.
The final rule set $R_{\mathrm{ARM}}$ is the union of the class rule sets, that is,

$$R_{\mathrm{ARM}} = \bigcup_{k=1}^{N_{TC}} R^{(k)}. \qquad (16)$$
This procedure is what we refer to as partitioned-ARM.
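As an illustration, here is a minimal sketch of the partitioned-ARM loop in Python. It assumes a one-hot encoded feature table and uses mlxtend's apriori as the frequent-itemset miner; the data layout, the per-class minimum-support mapping, and the helper names are our own illustration, not part of the source.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori

def partitioned_arm(X, y, min_supp):
    """Mine the class rule sets R(k) per Eqs. (14)-(16).

    X        : one-hot encoded feature DataFrame
    y        : Series of class labels defining the partitions P(k)
    min_supp : dict mapping class k -> MinSupp(k)
    """
    rules = []
    for k in sorted(y.unique()):                    # k = 1, ..., N_TC
        P_k = X[y == k]                             # partition P(k)
        freq = apriori(P_k, min_support=min_supp[k], use_colnames=True)
        # Each frequent itemset F in P(k) yields one class rule F => C(k).
        rules.extend((itemset, k) for itemset in freq["itemsets"])
    return rules                                    # R_ARM: union over all k

def global_confidence(X, y, antecedent, k):
    """Confidence of F => C(k) computed over the WHOLE database,
    as noted above, rather than within the partition P(k)."""
    covers = X[list(antecedent)].all(axis=1)        # rows matching F
    return (covers & (y == k)).sum() / covers.sum()
```

Mining inside each partition fixes the consequent at $C^{(k)}$, so only the frequent antecedents need to be enumerated; the confidence, per the note above, is then evaluated against all records.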
An antecedent $F_\ell^{(k)}$ of a rule $r_\ell^{(k)}$ is of the form

$$F_\ell^{(k)} = \bigl\langle f_{\ell 1}^{(k)}, f_{\ell 2}^{(k)}, \ldots, f_{\ell N_F}^{(k)} \bigr\rangle, \qquad (17)$$
and we assume that each feature value $f_{\ell j}^{(k)}$ can assume either a singleton value $\Theta_{f_j}^{(i)}$, $i = 1, \ldots, n_{f_j}$, or the completely ambiguous value $\Theta_{f_j}$. In other words, the only feature imperfection we allow is a missing value, which is modeled as complete ambiguity. This is in contrast to the class label ambiguity, where both partial and complete ambiguities are allowed.
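To make the feature-value model concrete, the following sketch encodes each value as a subset of its frame: a singleton when observed, or the whole frame $\Theta_{f_j}$ when missing. The feature names and frames are hypothetical, chosen only for illustration.

```python
# Hypothetical frames of discernment Theta_{f_j} for two features.
FRAMES = {
    "color": frozenset({"red", "green", "blue"}),
    "size":  frozenset({"small", "large"}),
}

def encode(feature, value):
    """Return a singleton {value} for an observed value, or the whole
    frame Theta_{f_j} for a missing value (complete ambiguity)."""
    if value is None:                       # missing value
        return FRAMES[feature]
    return frozenset({value})               # singleton value

# Antecedent F = <f_1, f_2> with the second feature value missing:
F = (encode("color", "red"), encode("size", None))
```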
2.5 Rule Pruning
While the classification accuracy of the classifier is a function of the integrity of the rule set, its efficiency and the computational burden it imposes depend on the size of the rule set. Hence, to increase efficiency and reduce the associated computational burden, we propose to prune the rules.
Consider the antecedent $F_\ell^{(k)}$ of the rule $r_\ell^{(k)}$. Then, $f_{\ell j}^{(k)} = \Theta_{f_j}$ indicates complete ambiguity regarding the feature value $f_{\ell j}^{(k)}$. In this case, the feature value does not provide any valuable information. To formalize the relevant notions, we introduce
Definition 1 (Level of Abstraction (LoA)). For a given rule $r_\ell^{(k)} : F_\ell^{(k)} \Rightarrow C^{(k)}$, $\ell = 1, \ldots, N_{R^{(k)}}$, $k = 1, \ldots, N_{TC}$, define

$$\mathrm{LoA}\bigl[ r_\ell^{(k)} : f_{\ell j}^{(k)} \bigr] = \begin{cases} 1, & \text{if } f_{\ell j}^{(k)} = \Theta_{f_j}; \\ 0, & \text{otherwise}. \end{cases} \qquad (18)$$
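Continuing the encoding sketch above (which supplies the hypothetical FRAMES table and antecedent F), the LoA indicator of Eq. (18) might be computed as follows; this is only a plausible rendering of the definition, not the authors' implementation.

```python
def loa(antecedent, feature_names):
    """LoA[r : f_j] per Eq. (18): 1 when f_j equals the whole frame
    Theta_{f_j}, i.e., the value carries no information; else 0."""
    return [1 if fj == FRAMES[name] else 0
            for name, fj in zip(feature_names, antecedent)]

# Example: the 'size' value in F above is completely ambiguous.
print(loa(F, ["color", "size"]))   # -> [0, 1]
```

A rule whose antecedent scores 1 on a feature gains nothing from that feature, which is exactly the information the pruning step exploits.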