Database Reference
In-Depth Information
db using the rules above. To explain the idea of the algorithm, next we focus
on how to incrementally compute the 1-itemsets, which means the itemsets of
length one. The remaining itemsets are computed analogously in subsequent
iterations. To compute L 1 in the updated database DB , the FUP algorithm
proceeds as follows:
￿ Scan db for all itemsets X ∈ L 1 , and update their support count X.s DB .
Then, if X.s DB < minsup
( D + d ), X will not be in L 1 (in the original
FUP algorithm, it is called a loser).
￿ In the same scan, compute the candidate set C 1 with all the items X that
are in db but not in L 1 .If X.s db < minsup
×
× d , X cannot be a frequent
itemset in the updated database.
￿ Scan the original database DB to update the support count for each X ∈
C 1 . Then, we can generate L 1 .
For instance, suppose that our example database is updated with the
following transactions (the incremental database db ):
TransactionId
Items
5000
{ 1,2,4 }
6000
{ 4 }
The count of each item in db is given by
Item
Count
1
1
2
1
4
2
Recall that we require minsup = 50%. Let us consider first the frequent
1-itemsets in the original database DB , which means the items in L 1 .These
are I 1 = 1 (appears three times in the database), I 2 = 2 (appears twice in
the database), and I 3 = 3 (also appears twice in the database). The first
step of the FUP algorithm requires a scan of db for all itemsets L 1 and the
computation of their support with respect to DB . For each one of these
items, we have
I 1 .s DB =4 > 0 . 5
×
6
I 2 .s DB =3=0 . 5
×
6
I 3 .s DB =2 < 0 . 5
×
6
Therefore, itemset I 3 will be a loser since it does not verify the support
in the updated database; therefore, it is dropped. On the contrary, I 1 and I 2
will be included in L 1 .
 
Search WWH ::




Custom Search