Database Reference
In-Depth Information
Table 6.1 Transactional
database-positive and
negative items
TID
Original TD
Augmented TD
1
A,C,D
A ,
¬
B , C , D ,
¬
E
2
B,C
¬
A , B , C ,
¬
D ,
¬
E
3
C
¬
A ,
¬
B , C ,
¬
D ,
¬
E
4
A,B,E
A , B ,
¬
C ,
¬
D , E
5
A,C,D
A ,
¬
B , C , D.
¬
E
Definition 1 (Association Rule) An association rule is an implication of the form
X
I
I
=∅
Y ”, where X
, Y
, and X
Y
.
Definition 2
(Support) The rule X
Y has a support s in the transaction set
D
if
s % of the transactions in
Y . In other words, the support of the rule is
the probability that X and Y hold together among all the possible presented cases.
D
contain X
Definition 3
(Confidence) The rule X
Y holds in the transaction set
D
with
confidence c if c % of transactions in
that contain X also contain Y . In other words,
the confidence of the rule is the conditional probability that the consequent Y is true
under the condition of the antecedent X .
The problem of discovering all association rules from a set of transactions
D
D
consists of generating the rules that have a support and confidence greater than given
thresholds.
Definition 4
(Negative Item and Positive Item) A negative item is defined as
¬
i k , meaning that item i k is absent from a transaction T . The support of
¬
i k is
s (
s ( i k ). i k , a positive item, is an item that is present in a transaction.
Definition 5 (Negative Association Rule) A negative association rule is an impli-
cation of the form X
¬
i k )
=
1
Y , where X
I
, Y
I
, and X
Y
=∅
and X and/or Y
contain at least one negative item.
Definition 6 Negative Associations between Itemsets A negative association be-
tween two positive itemsets X , Y are rules of the following forms
¬
X
Y , X
⇒¬
Y
and
Y .
Table 6.1 shows a toy transactional database with 5 transactions and 5 items. “Orig-
inal TD” column shows the items present in each transaction, while “Augmented TD”
column shows both present and absent items.
Mining association rules from a transactional database that contains information
about both present and absent items is computationally expensive due to the following
reasons:
¬
X
⇒¬
1. The number of items in the transactional database swells when their negative
counterparts are added to a transactional database. The maximum number of
patterns that can be found in a transactional database with d items is 2 d
1. The
number of items in the “Original TD” in Table 6.1 is n
5. Even for the small set
in Table 6.1 , the number of itemsets jumps dramatically from 31 to 1023 when
the negative items are added.
=
 
Search WWH ::




Custom Search