Frequent Itemsets - Mining of Massive Datasets

Frequent Itemsets in Streams: If we use a decaying window with constant

c, then we can start counting an item whenever we see it in a basket. We

start counting an itemset if we see it contained within the current basket,

and all its immediate proper subsets are already being counted. As the

window is decaying, we multiply all counts by 1−c and eliminate those

that are less than 1/2.
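
The update rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name process_basket, the dictionary layout, and the max_size cap on itemset size are all hypothetical, and the sketch assumes that "already being counted" means counted before the current basket arrived, and that counts are dropped after the current basket has been added.

```python
from itertools import combinations


def process_basket(counts, basket, c, max_size=3):
    """Update decaying-window counts in place for one arriving basket.

    counts   -- dict mapping frozenset (itemset) to its decayed count
    basket   -- iterable of hashable items
    c        -- decay constant of the window (a small positive number)
    max_size -- hypothetical cap on itemset size, to bound the work
    """
    basket = frozenset(basket)

    # Itemsets that were being counted before this basket arrived.
    previously_counted = set(counts)

    # The window is decaying: multiply every count by 1 - c.
    for itemset in previously_counted:
        counts[itemset] *= (1 - c)

    # Add 1 to every counted itemset contained in the basket.
    for itemset in previously_counted:
        if itemset <= basket:
            counts[itemset] += 1.0

    # Start counting any item we see in the basket.
    for item in basket:
        counts.setdefault(frozenset([item]), 1.0)

    # Start counting a larger itemset only if all of its immediate
    # proper subsets were already being counted (assumption: before
    # this basket arrived).
    for size in range(2, max_size + 1):
        for combo in combinations(basket, size):
            key = frozenset(combo)
            if key in counts:
                continue
            if all(key - {x} in previously_counted for x in combo):
                counts[key] = 1.0

    # Eliminate counts that have fallen below 1/2.
    for itemset in [s for s, v in counts.items() if v < 0.5]:
        del counts[itemset]

    return counts
```

Calling process_basket once per basket with the same counts dictionary maintains the decayed counts. Because a new itemset needs its immediate subsets to be counted first, a pair can enter the dictionary no earlier than the second basket that contains both of its items.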

6.7 References for Chapter 6

The market-basket data model, including association rules and the A-Priori

Algorithm, is from [1] and [2].

The PCY Algorithm is from [4]. The Multistage and Multihash Algorithms

are found in [3].

The SON Algorithm is from [5]. Toivonen's Algorithm appears in [6].

1. R. Agrawal, T. Imielinski, and A. Swami, “Mining associations between

sets of items in massive databases,” Proc. ACM SIGMOD Intl. Conf. on

Management of Data, pp. 207-216, 1993.

2. R. Agrawal and R. Srikant, “Fast algorithms for mining association rules,”

Intl. Conf. on Very Large Databases, pp. 487-499, 1994.

3. M. Fang, N. Shivakumar, H. Garcia-Molina, R. Motwani, and J.D. Ullman,

“Computing iceberg queries efficiently,” Intl. Conf. on Very Large

Databases, pp. 299-310, 1998.

4. J.S. Park, M.-S. Chen, and P.S. Yu, “An effective hash-based algorithm

for mining association rules,” Proc. ACM SIGMOD Intl. Conf. on

Management of Data, pp. 175-186, 1995.

5. A. Savasere, E. Omiecinski, and S.B. Navathe, “An efficient algorithm for

mining association rules in large databases,” Intl. Conf. on Very Large

Databases, pp. 432-444, 1995.

6. H. Toivonen, “Sampling large databases for association rules,” Intl. Conf.

on Very Large Databases, pp. 134-145, 1996.
