A Projected Clustering Algorithm and Its Biomedical Application - Clustering Challenges in Biological Network


Empirical results show that IPROCLUS accurately discovers clusters embedded in
lower-dimensional subspaces. On the synthetic datasets, it achieves much higher
accuracy than PROCLUS on the scaled data while keeping comparable performance
on the unscaled data in all three cases. Moreover, IPROCLUS depends less on the
parameter l than PROCLUS does. When applied to the colon tumor dataset,
IPROCLUS again achieves much higher accuracy than PROCLUS.

