12.9 CONCLUSION
In this chapter, we presented an overview of a set of approaches and systems that
have been proposed for developing scalable stream data-processing systems and
solutions. Although we have focused on the main research and open-source projects
in this domain, we also acknowledge the existence of commercial systems and
technologies such as Microsoft StreamInsight* and StreamBase. Although the design
of distributed stream processing engines has attracted the attention of the
research community in the last few years, we are convinced that there is still
room for further optimization and advancement in different directions. For example,
defining the right and most convenient programming abstractions and standard
declarative interfaces for these systems is an important research direction that
still needs to be tackled. Likewise, designing innovative frameworks and mechanisms
that combine the capabilities of large-scale distributed batch processing systems
(e.g., MapReduce) with the strengths of distributed stream processing engines
remains a clear gap in advanced Big Data processing techniques, one that has yet
to attract sufficient attention from the research community.
* http://msdn.microsoft.com/en-us/sqlserver/ee476990.aspx.
http://www.streambase.com/.