Clustering - Mining of Massive Datasets


7.8 References for Chapter 7

The ancestral study of clustering for large-scale data is the BIRCH Algorithm of [6]. The BFR Algorithm is from [2]. The CURE Algorithm is found in [5]. The paper on the GRGPF Algorithm is [3]. The necessary background regarding B-trees and R-trees can be found in [4]. The study of clustering on streams is taken from [1].

1. B. Babcock, M. Datar, R. Motwani, and L. O'Callaghan, “Maintaining variance and k-medians over data stream windows,” Proc. ACM Symp. on Principles of Database Systems, pp. 234-243, 2003.

2. P.S. Bradley, U.M. Fayyad, and C. Reina, “Scaling clustering algorithms to large databases,” Proc. Knowledge Discovery and Data Mining, pp. 9-15, 1998.

3. V. Ganti, R. Ramakrishnan, J. Gehrke, A.L. Powell, and J.C. French, “Clustering large datasets in arbitrary metric spaces,” Proc. Intl. Conf. on Data Engineering, pp. 502-511, 1999.

4. H. Garcia-Molina, J.D. Ullman, and J. Widom, Database Systems: The Complete Book, Second Edition, Prentice-Hall, Upper Saddle River, NJ, 2009.

5. S. Guha, R. Rastogi, and K. Shim, “CURE: An efficient clustering algorithm for large databases,” Proc. ACM SIGMOD Intl. Conf. on Management of Data, pp. 73-84, 1998.

6. T. Zhang, R. Ramakrishnan, and M. Livny, “BIRCH: an efficient data clustering method for very large databases,” Proc. ACM SIGMOD Intl. Conf. on Management of Data, pp. 103-114, 1996.
