their work with idling ones or not. Another open problem is mining sub-patterns in a
single large object, where a sub-pattern can span the data of multiple processes.
Current methods for sequence motif mining and for frequent subgraph mining in a large
graph take one of two approaches. They either impose a maximum pattern length, which
lets each process store data that overlaps its partition boundaries, or they transfer
data partitions among all processes during each iteration of the algorithm. Neither
approach scales to Big Data, so efficient methods that solve this problem exactly are
still needed.
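The first of these approaches, overlapping partition boundaries under a maximum pattern length, can be illustrated with a minimal sketch. It is not taken from any of the cited methods: it assumes the large object is a plain string, that patterns are contiguous substrings, and that processes are simulated by a loop. The names partition_with_overlap, count_local, and count_global are hypothetical.

```python
from collections import Counter


def partition_with_overlap(seq, num_parts, max_len):
    """Split seq into num_parts chunks. Each chunk carries max_len - 1
    extra symbols past its boundary, so any pattern of length <= max_len
    that starts inside the chunk is fully visible to its owner."""
    n = len(seq)
    base = -(-n // num_parts)  # ceiling division
    parts = []
    for p in range(num_parts):
        start = min(p * base, n)
        end = min(start + base, n)
        chunk = seq[start:min(end + max_len - 1, n)]
        parts.append((start, end, chunk))
    return parts


def count_local(part, max_len):
    """Count substrings of length 1..max_len that START in the owned
    region of a partition (hypothetical per-process step)."""
    start, end, chunk = part
    owned = end - start
    counts = Counter()
    for i in range(owned):
        for k in range(1, max_len + 1):
            if i + k <= len(chunk):
                counts[chunk[i:i + k]] += 1
    return counts


def count_global(seq, max_len):
    """Serial reference: the same counts computed on the whole sequence."""
    counts = Counter()
    for i in range(len(seq)):
        for k in range(1, max_len + 1):
            if i + k <= len(seq):
                counts[seq[i:i + k]] += 1
    return counts


seq = "abcabcabca"
parts = partition_with_overlap(seq, num_parts=3, max_len=3)
# Simulate the reduction step: sum the per-process counts.
merged = sum((count_local(p, 3) for p in parts), Counter())
```

In this sketch the merged counts match the serial counts because each occurrence is attributed to the partition in which it starts. The replicated data per partition grows with max_len, and patterns longer than max_len are never counted, which is the limitation the paragraph above describes.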