3 Preliminaries
As noted in Sect. 1.1, $\mathcal{R}$ represents a complete set of possible CARs that are generated from $D_{TR}$, and $R_j$ represents a rule in set $\mathcal{R} = \{R_1, R_2, \ldots, R_{2^n-n-2}, R_{2^n-n-1}\}$ with label $j$.
3.1 Proposed Rule Weighting Scheme
Item Weighting Score
There are $n$ items involved in $D_{TR}$. For a particular pre-defined class $A$ (as $c_i \in C$), a score is assigned to each item in $D_{TR}$ that distinguishes the significant items for class $A$ from the insignificant ones.
Definition 1. Let $c_A(\mathit{Item}_h)$ denote the contribution of each $\mathit{Item}_h \in D_{TR}$ for class $A$, which represents how significantly $\mathit{Item}_h$ determines $A$, where $0 \le c_A(\mathit{Item}_h) \le |C|$, and $|C|$ denotes the size of the set $C$.
The calculation of $c_A(\mathit{Item}_h)$ is given as follows:
\[
c_A(\mathit{Item}_h) = \mathit{TransFreq}(\mathit{Item}_h, A) \times \bigl(1 - \mathit{TransFreq}(\mathit{Item}_h, \bar{A})\bigr) \times \frac{|C|}{\mathit{ClassCount}(\mathit{Item}_h, C)},
\]
where
1. The $\mathit{TransFreq}(\mathit{Item}_h, A \text{ or } \bar{A})$ function computes how frequently $\mathit{Item}_h$ appears in class $A$ or in the group of classes $\bar{A}$ (the complement of $A$). The calculation of this function is:
\[
\frac{\text{number of transactions with } \mathit{Item}_h \text{ in the class(es)}}{\text{number of transactions in the class(es)}} .
\]
2. The $\mathit{ClassCount}(\mathit{Item}_h, C)$ function simply counts the number of classes in $C$ which contain $\mathit{Item}_h$.
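The calculation above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the record layout (a set of items paired with a class label) and the names `trans_freq`, `class_count` and `item_score` are hypothetical. It also assumes that a class "contains" an item when at least one of its transactions does, and that an item absent from the data, or an empty group of classes, contributes 0.

```python
def trans_freq(item, records, classes):
    """Fraction of the transactions belonging to `classes` that contain `item`."""
    in_classes = [items for items, label in records if label in classes]
    if not in_classes:
        return 0.0  # assumption: an empty group of classes yields frequency 0
    return sum(item in items for items in in_classes) / len(in_classes)


def class_count(item, records):
    """Number of classes with at least one transaction containing `item`."""
    return len({label for items, label in records if item in items})


def item_score(item, target, records):
    """TransFreq(h, A) * (1 - TransFreq(h, complement of A)) * |C| / ClassCount(h, C)."""
    all_classes = {label for _, label in records}
    n_containing = class_count(item, records)
    if n_containing == 0:
        return 0.0  # assumption: an item that never occurs scores 0
    freq_in = trans_freq(item, records, {target})
    freq_out = trans_freq(item, records, all_classes - {target})
    return freq_in * (1.0 - freq_out) * len(all_classes) / n_containing


# Hypothetical training data: (items in the transaction, class label).
records = [
    ({"a", "b"}, "A"),
    ({"a"}, "A"),
    ({"b"}, "B"),
    ({"b", "c"}, "B"),
    ({"c"}, "C"),
]
print(item_score("a", "A", records))  # 3.0, the upper bound |C|
print(item_score("b", "A", records))  # approximately 0.25
```

Item `a` occurs in every transaction of class A and nowhere else, so it reaches the upper bound $|C| = 3$, whereas item `b` is shared with class B and scores far lower.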
The rationale of this item weighting score is as follows:
1. The weighting score of $\mathit{Item}_h$ for class $A$ tends to be high if $\mathit{Item}_h$ is frequent in $A$.
2. The weighting score of $\mathit{Item}_h$ for class $A$ tends to be high if $\mathit{Item}_h$ is infrequent in $\bar{A}$.
3. The weighting score of $\mathit{Item}_h$ for any class tends to be high if $\mathit{Item}_h$ is involved in a small number of classes in $C$. In [5], a similar idea can be found in feature selection for text categorisation.
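As an illustration with hypothetical figures (not taken from the source), suppose that $|C| = 3$, that $\mathit{Item}_h$ occurs in 8 of the 10 transactions of class $A$ and in 2 of the 20 transactions of the remaining classes, and that it is contained in 2 of the 3 classes. Then
\[
c_A(\mathit{Item}_h) = \frac{8}{10} \times \Bigl(1 - \frac{2}{20}\Bigr) \times \frac{3}{2} = 0.8 \times 0.9 \times 1.5 = 1.08 .
\]
With the same two frequencies but with the item contained in all 3 classes, the last factor becomes $3/3 = 1$ and the score drops to $0.72$, in line with point 3.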
Rule Weighting Score
Based on the item weighting score, a weighting score is assigned to the rule antecedent of each $R_j \in \mathcal{R}$.