Distributed Programming for the Cloud - Large Scale and Big Data: Processing and Management

of nested data. Google has also presented the Pregel system [98], for which the Apache Giraph and Apache Hama projects provide open-source implementations, which uses a bulk synchronous parallel (BSP) programming model for efficient and scalable processing of massive graphs on distributed clusters of commodity machines. Recently, Twitter has announced the release of the Storm* system as a distributed and fault-tolerant platform for implementing continuous and real-time processing applications over streamed data. We believe that more of these domain-specific systems will be introduced in the future to form the new generation of Big Data systems. Defining the right and most convenient programming abstractions and declarative interfaces for these domain-specific Big Data systems is another important research direction that will need to be deeply investigated.

REFERENCES

1. Large synoptic survey. http://www.lsst.org/.

2. Daniel J. Abadi, Adam Marcus, Samuel Madden, and Kate Hollenbach. SW-Store: A vertically partitioned DBMS for semantic web data management. VLDB Journal, 18(2):385-406, 2009.

3. Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel J. Abadi, Alexander Rasin, and Avi Silberschatz. HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for analytical workloads. PVLDB, 2(1):922-933, 2009.

4. Azza Abouzied, Kamil Bajda-Pawlikowski, Jiewen Huang, Daniel J. Abadi, and Avi Silberschatz. HadoopDB in action: Building real world applications. In SIGMOD, 2010.

5. Foto N. Afrati, Anish Das Sarma, David Menestrina, Aditya G. Parameswaran, and Jeffrey D. Ullman. Fuzzy joins using MapReduce. In ICDE, pp. 498-509, 2012.

6. Foto N. Afrati and Jeffrey D. Ullman. Optimizing joins in a map-reduce environment. In EDBT, pp. 99-110, 2010.

7. Foto N. Afrati and Jeffrey D. Ullman. Optimizing multiway joins in a map-reduce environment. IEEE TKDE, 23(9):1282-1298, 2011.

8. Alexander Alexandrov, Dominic Battré, Stephan Ewen, Max Heimel, Fabian Hueske, Odej Kao, Volker Markl, Erik Nijkamp, and Daniel Warneke. Massively parallel data analysis with PACTs on Nephele. PVLDB, 3(2):1625-1628, 2010.

9. Peter Alvaro, Tyson Condie, Neil Conway, Khaled Elmeleegy, Joseph M. Hellerstein, and Russell Sears. Boom analytics: Exploring data-centric, declarative programming for the cloud. In EuroSys, pp. 223-236, 2010.

10. Ahmed M. Aly, Asmaa Sallam, Bala M. Gnanasekaran, Long-Van Nguyen-Dinh, Walid G. Aref, Mourad Ouzzani, and Arif Ghafoor. M3: Stream processing on main-memory MapReduce. In ICDE, 2012.

11. Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy H. Katz, Andrew Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. Above the clouds: A Berkeley view of cloud computing, February 2009.

12. Shivnath Babu. Towards automatic optimization of MapReduce programs. In SoCC, pp. 137-142, 2010.

13. Andrey Balmin, Tim Kaldewey, and Sandeep Tata. Clydesdale: Structured data processing on Hadoop. In SIGMOD Conference, pp. 705-708, 2012.

14. Luiz André Barroso and Urs Hölzle. The case for energy-proportional computing. IEEE Computer, 40(12):33-37, 2007.

* https://github.com/nathanmarz/storm/.
