application needs, such as data consistency and transaction management, have
received relatively little research attention. These needs are already difficult
problems for flat data, as in distributed relational databases [8,21], and they
will be more challenging for graph systems.
7.8.3 Computation Model
“One size does not fit all.” Application needs drive the computation model. Current
systems are mainly based on MapReduce and the vertex-oriented execution model.
However, it remains an open problem to extend these models to support indexes and
to meet application needs such as consistency and transaction management.
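To make the vertex-oriented execution model concrete, the following is a minimal single-machine sketch, not the implementation of any system surveyed in this chapter. It assumes a Pregel-style loop in which computation proceeds in supersteps and vertices exchange messages along edges. The function name `connected_components` and the choice of connected-component labeling as the example task are illustrative assumptions.

```python
from collections import defaultdict


def connected_components(edges):
    """Label every vertex with the smallest vertex id in its component.

    Illustrative vertex-oriented sketch: in each superstep, a vertex reads
    its incoming messages, updates its own label, and notifies its
    neighbors only if the label changed.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Every vertex starts with its own id as its label.
    label = {v: v for v in neighbors}

    # Superstep 0: every vertex sends its label to all of its neighbors.
    inbox = defaultdict(list)
    for v in neighbors:
        for n in neighbors[v]:
            inbox[n].append(label[v])

    supersteps = 0
    while inbox:  # stop when no messages are in flight
        supersteps += 1
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            smallest = min(messages)
            if smallest < label[v]:
                label[v] = smallest
                for n in neighbors[v]:
                    outbox[n].append(smallest)
        inbox = outbox
    return label, supersteps


labels, steps = connected_components([(1, 2), (2, 3), (4, 5)])
print(labels)
```

Note that the sketch has no notion of an index or a transaction: each vertex sees only its own state and its messages. This is one way to see why extending the model with such features is not straightforward.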
7.8.4 Cost of Ownership
Ideally, users want to minimize the cost of ownership while satisfying the performance
requirement and other quality-of-service attributes. However, the design space spanned
by the available hardware and software components is huge. In the cloud specifically,
different providers offer very different pricing structures. Even within the same cloud
provider, the capabilities of virtual machines can be quite different [27]. More research
is needed on automatic and customizable design that accounts for the cost of ownership.
7.9 SUMMARY
In this chapter, we have surveyed a number of applications of large graphs and
representative existing cloud-based systems for large graph processing. One of the
classic techniques for handling large graphs is graph partitioning. The chapter reviewed
the network unevenness of the cloud, which poses new challenges to graph partitioning
techniques. In particular, machines connected by high-bandwidth links can handle more
of the work on cross-partition edges. The chapter then focused on network
performance-aware graph partitioning. The techniques include modeling the machines and
the network bandwidth between them as a machine graph, and partitioning the data graph
in correspondence with the machine graph. These techniques reduce network traffic in
both partitioning and processing. Processing on the partitioned graph may further
exploit the locality of the partitions to reduce communication. Many open problems
in this field still require further research.
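The idea of matching graph partitions to a machine graph can be illustrated with a small sketch. This is not the partitioning algorithm described in the chapter. It is a brute-force placement, practical only for a handful of partitions, which assumes that the cost of a placement is the number of cross-partition edges divided by the bandwidth of the link they travel over. All names and numbers below (`transfer_cost`, `best_placement`, the machines, the bandwidths, and the edge counts) are hypothetical.

```python
from itertools import permutations


def transfer_cost(placement, cross_edges, bandwidth):
    """Sum of (cross-partition edge count / link bandwidth) over partition pairs."""
    total = 0.0
    for (p, q), count in cross_edges.items():
        link = frozenset((placement[p], placement[q]))
        total += count / bandwidth[link]
    return total


def best_placement(partitions, machines, cross_edges, bandwidth):
    """Try every one-to-one assignment of partitions to machines (brute force)."""
    best, best_cost = None, float("inf")
    for perm in permutations(machines, len(partitions)):
        placement = dict(zip(partitions, perm))
        cost = transfer_cost(placement, cross_edges, bandwidth)
        if cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


# Hypothetical machine graph: A and B share a fast link, C is behind slow links.
bandwidth = {
    frozenset(("A", "B")): 10.0,
    frozenset(("A", "C")): 1.0,
    frozenset(("B", "C")): 1.0,
}
# Hypothetical cross-partition edge counts: P0 and P1 are tightly coupled.
cross_edges = {("P0", "P1"): 100, ("P1", "P2"): 10, ("P0", "P2"): 10}

placement, cost = best_placement(
    ["P0", "P1", "P2"], ["A", "B", "C"], cross_edges, bandwidth
)
print(placement, cost)
```

In this toy setting, the two partitions that share the most edges end up on the two machines joined by the fastest link. This is the intuition behind network performance-aware partitioning.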
REFERENCES
1. A new application award: Semantic web challenge. http://challenge.semanticweb.org/,
2013.
2. A. Abouzeid, K. Bajda-Pawlikowski, D. Abadi, A. Silberschatz, and A. Rasin.
HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for
analytical workloads. Proc. VLDB Endow., 2009.
3. C. C. Aggarwal, Y. Zhao, and P. S. Yu. A framework for clustering massive graph
streams: Submission to best of SDM 2010 issue. Stat. Anal. Data Min., 3(6):399-416,
December 2010.
4. Apache Giraph. http://giraph.apache.org/.