3 Preliminaries
As noted in Sect. 1.1, $\mathcal{R} = \{R_1, R_2, \ldots, R_{2^n-n-2}, R_{2^n-n-1}\}$ represents a complete set of possible CARs that are generated from $D_{TR}$, and $R_j$ represents a rule in set $\mathcal{R}$ with label $j$.
3.1 Proposed Rule Weighting Scheme
Item Weighting Score
There are $n$ items involved in $D_{TR}$. For a particular pre-defined class $A$ (one of the classes $c_i \in C$), a score is assigned to each item in $D_{TR}$ that distinguishes the significant items for class $A$ from the insignificant ones.
Definition 1. Let $c_A(\mathit{Item}_h)$ denote the contribution of each $\mathit{Item}_h \in D_{TR}$ for class $A$, which represents how significantly $\mathit{Item}_h$ determines $A$, where $0 \le c_A(\mathit{Item}_h) \le |C|$, and $|C|$ is the size of the set $C$.
The calculation of $c_A(\mathit{Item}_h)$ is given as follows:
$$c_A(\mathit{Item}_h) = \mathit{TransFreq}(\mathit{Item}_h, A) \times \bigl(1 - \mathit{TransFreq}(\mathit{Item}_h, \bar{A})\bigr) \times \frac{|C|}{\mathit{ClassCount}(\mathit{Item}_h, C)},$$
where
1. The $\mathit{TransFreq}(\mathit{Item}_h, A \text{ or } \bar{A})$ function computes how frequently $\mathit{Item}_h$ appears in class $A$ or in the group of classes $\bar{A}$ (the complement of $A$). The calculation of this function is:
$$\frac{\text{number of transactions with } \mathit{Item}_h \text{ in the class(es)}}{\text{number of transactions in the class(es)}}.$$
2. The $\mathit{ClassCount}(\mathit{Item}_h, C)$ function simply counts the number of classes in $C$ which contain $\mathit{Item}_h$.
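The score defined above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `item_weighting_scores`, the representation of $D_{TR}$ as a list of `(items, class_label)` pairs, and the toy data are all hypothetical, and $C$ is assumed to be the set of class labels that occur in the data.

```python
def item_weighting_scores(transactions, target_class):
    """Compute c_A(Item_h) for every item, with A = target_class.

    `transactions` is assumed to be a list of (set_of_items, class_label)
    pairs; this layout is an assumption made for the sketch.
    """
    # Assumption: C is the set of class labels present in the data.
    classes = {label for _, label in transactions}
    in_a = [items for items, label in transactions if label == target_class]
    not_a = [items for items, label in transactions if label != target_class]
    all_items = set().union(*(items for items, _ in transactions))

    def trans_freq(item, group):
        # Fraction of transactions in `group` that contain `item`.
        if not group:
            return 0.0
        return sum(item in items for items in group) / len(group)

    def class_count(item):
        # Number of distinct classes with at least one transaction containing `item`.
        return len({label for items, label in transactions if item in items})

    scores = {}
    for item in all_items:
        scores[item] = (
            trans_freq(item, in_a)
            * (1 - trans_freq(item, not_a))
            * len(classes) / class_count(item)
        )
    return scores


# Hypothetical toy data: two classes, "A" and "B".
transactions = [
    ({"a", "b"}, "A"),
    ({"a", "c"}, "A"),
    ({"b", "c"}, "B"),
    ({"c"}, "B"),
]
scores = item_weighting_scores(transactions, "A")
```

On this toy data, item `a` occurs in every transaction of class A and nowhere else, so it reaches the upper bound $|C| = 2$; item `b` is equally frequent in both groups and scores 0.25; item `c` occurs in every transaction outside A and scores 0.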
The rationale of this item weighting score is as follows:
1. The weighting score of $\mathit{Item}_h$ for class $A$ tends to be high if $\mathit{Item}_h$ is frequent in $A$.
2. The weighting score of $\mathit{Item}_h$ for class $A$ tends to be high if $\mathit{Item}_h$ is infrequent in $\bar{A}$.
3. The weighting score of $\mathit{Item}_h$ for any class tends to be high if $\mathit{Item}_h$ is involved in a small number of classes in $C$. In [5], a similar idea can be found in feature selection for text categorisation.
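A short worked example may make the three effects concrete. The numbers are hypothetical and chosen only for illustration. Suppose $|C| = 3$. An item that appears in 80% of the transactions of $A$, in 10% of the transactions of $\bar{A}$, and in two of the three classes receives

$$c_A = 0.8 \times (1 - 0.1) \times \frac{3}{2} = 1.08,$$

whereas an item that is equally frequent in $A$ but appears in 60% of the transactions of $\bar{A}$ and in all three classes receives

$$c_A = 0.8 \times (1 - 0.6) \times \frac{3}{3} = 0.32.$$

The second item is penalised both for being frequent outside $A$ and for being spread over more classes.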
Rule Weighting Score
Based on the item weighting score, a weighting score is assigned to the rule antecedent of each $R_j \in \mathcal{R}$.
 