Database Reference
In-Depth Information
db
using the rules above. To explain the idea of the algorithm, next we focus
on how to incrementally compute the 1-itemsets, which means the itemsets of
length one. The remaining itemsets are computed analogously in subsequent
iterations. To compute
L
1
in the updated database
DB
, the FUP algorithm
proceeds as follows:
Scan
db
for all itemsets
X ∈ L
1
, and update their support count
X.s
DB
.
Then, if
X.s
DB
<
minsup
(
D
+
d
),
X
will not be in
L
1
(in the original
FUP algorithm, it is called a loser).
In the same scan, compute the candidate set
C
1
with all the items
X
that
are in
db
but not in
L
1
.If
X.s
db
<
minsup
×
× d
,
X
cannot be a frequent
itemset in the updated database.
Scan the original database
DB
to update the support count for each
X ∈
C
1
. Then, we can generate
L
1
.
For instance, suppose that our example database is updated with the
following transactions (the incremental database
db
):
TransactionId
Items
5000
{
1,2,4
}
6000
{
4
}
The count of each item in
db
is given by
Item
Count
1
1
2
1
4
2
Recall that we require
minsup
= 50%. Let us consider first the frequent
1-itemsets in the original database
DB
, which means the items in
L
1
.These
are
I
1
=
1
(appears three times in the database),
I
2
=
2
(appears twice in
the database), and
I
3
=
3
(also appears twice in the database). The first
step of the FUP algorithm requires a scan of
db
for all itemsets
L
1
and the
computation of their support with respect to
DB
. For each one of these
items, we have
I
1
.s
DB
=4
>
0
.
5
×
6
I
2
.s
DB
=3=0
.
5
×
6
I
3
.s
DB
=2
<
0
.
5
×
6
Therefore, itemset
I
3
will be a loser since it does not verify the support
in the updated database; therefore, it is dropped. On the contrary,
I
1
and
I
2
will be included in
L
1
.
Search WWH ::
Custom Search