Databases Reference
In-Depth Information
Fig. 4.
Pseudocode of a priori Algorithm
where:
Join
(
{
i
1
,...,i
k
}
,L
k−
1
)
≡{
i
1
,..,i
k−
2
,i
k−
1
}∈
L
k−
1
(3)
∧{
i
1
,..,i
k−
2
,i
k
}∈
L
k−
1
,
Prune
(
c,L
k−
1
)
≡∀
s
[
s
⊂
c
]
∧|
s
|
=
k
−
1
→
s
∈
L
k−
1
.
(4)
Observe that the Join step (3) takes two (
k
-1)-itemsets of a same equiva-
lence class to generate a
k
-itemsets.
The a priori algorithm is presented in Fig. 4.
The procedure a priori gen used in step 3 is described in (2).
3 Related Work
The vertical binary representations (VTV) and the corresponding support
counting method have been investigated by other researchers [7, 8, 10, 12, 22].
Zaki et al. proposed several algorithms using vertical binary representa-
tions in 1997 [22]. Their improvements are obtained clustering the database
and applying an a priori-like method with simple tid-vectors.
Gardarin et al. proposed two breadth-first search algorithms using vertical
binary representations named N-BM and H-BM in 1998 [8]. N-BM consid-
ers simple (uncompressed) vertical binary representations for itemsets. Mean-
while, H-BM uses, besides, an auxiliary bit-vector, where each bit represents
a group of bits of the original bit-vector. In order to save the memory, every
1-itemset has both, while every large itemset keeps only the auxiliary bit-
vector. H-BM first performs the AND between auxiliary bit-vectors and only
non-zero groups are considered to the final count. However, as in large itemsets