Database Reference
In-Depth Information
and communication overhead. Trinity pools the memory of the machines in the
cloud into a “memory cloud” that enables fast random data access, which is
particularly useful for computation on graphs. In addition, Trinity includes a
native graph storage engine. Together, these techniques significantly speed up
large graph processing. Trinity supports both transactional and batched graph
processing.
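The idea of a memory cloud can be illustrated with a toy sketch. This is not Trinity's API: the `MemoryCloud` class, the modulo partitioning of vertex ids, and the use of in-process dictionaries as stand-ins for machines are all assumptions made for illustration. The point is only that a traversal touches vertices in an unpredictable order, so each step is a random lookup that is cheap when adjacency lists are held in memory.

```python
from collections import deque


class MemoryCloud:
    """Toy stand-in for a memory cloud (hypothetical, not Trinity's API).

    Each "machine" is an in-process dict; a vertex id is assigned to a
    machine by simple modulo partitioning (an assumption for this sketch).
    """

    def __init__(self, num_machines):
        self.machines = [dict() for _ in range(num_machines)]

    def _owner(self, vertex_id):
        # Pick the partition that holds this vertex.
        return self.machines[vertex_id % len(self.machines)]

    def put(self, vertex_id, neighbors):
        self._owner(vertex_id)[vertex_id] = list(neighbors)

    def get(self, vertex_id):
        # Random access by key: one in-memory lookup on the owning partition.
        return self._owner(vertex_id).get(vertex_id, [])


def bfs_order(cloud, start):
    """Breadth-first exploration that fetches adjacency lists on demand."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        order.append(vertex)
        for neighbor in cloud.get(vertex):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


# A small example graph spread over three simulated machines.
cloud = MemoryCloud(num_machines=3)
for vertex, neighbors in {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}.items():
    cloud.put(vertex, neighbors)
```

Calling `bfs_order(cloud, 1)` visits every vertex reachable from vertex 1, with each hop resolved by a key lookup rather than a scan over stored data.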
7.3.1.6 GraphLab
GraphLab [56] is designed specifically for machine learning and data mining
algorithms, which are not naturally supported by MapReduce. The GraphLab
abstraction lets developers specify asynchronous, dynamic, graph-parallel
computation while ensuring data consistency and achieving a high degree of
parallel performance in the shared-memory setting. GraphLab uses an
asynchronous parallel model, unlike the BSP model used by Pregel. The GraphLab
framework has also been extended to the distributed setting while preserving
strong data consistency guarantees [55].
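The difference between a BSP-style and an asynchronous model can be sketched on a small PageRank computation. The code below is an illustration under stated assumptions, not GraphLab's or Pregel's API: the function names, the example graph, and the single-threaded sweep order are all hypothetical. In the synchronous variant every vertex reads a snapshot taken at the start of the sweep (as if a barrier separated supersteps); in the asynchronous variant each vertex immediately sees updates made earlier in the same sweep.

```python
def in_neighbors(out_edges):
    """Invert an adjacency map {vertex: [targets]} into {vertex: [sources]}."""
    incoming = {v: [] for v in out_edges}
    for u, targets in out_edges.items():
        for v in targets:
            incoming[v].append(u)
    return incoming


def pagerank(out_edges, asynchronous, damping=0.85, tol=1e-10, max_sweeps=1000):
    """Return (ranks, sweeps_used). Assumes every vertex has an out-edge."""
    n = len(out_edges)
    incoming = in_neighbors(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for sweep in range(1, max_sweeps + 1):
        # Synchronous: read from a frozen copy of the previous sweep.
        # Asynchronous: read from the live values being updated in place.
        source = rank if asynchronous else dict(rank)
        delta = 0.0
        for v in out_edges:
            new_value = (1 - damping) / n + damping * sum(
                source[u] / len(out_edges[u]) for u in incoming[v]
            )
            delta = max(delta, abs(new_value - rank[v]))
            rank[v] = new_value
        if delta < tol:
            return rank, sweep
    return rank, max_sweeps


# Hypothetical four-vertex graph; every vertex has at least one out-edge.
graph = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
sync_rank, sync_sweeps = pagerank(graph, asynchronous=False)
async_rank, async_sweeps = pagerank(graph, asynchronous=True)
```

Both variants converge to the same ranks on this graph. Comparing `sync_sweeps` with `async_sweeps` shows how using fresher values can change the number of sweeps needed, which is one motivation for asynchronous execution in iterative machine learning workloads.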
Other cloud-based solutions for graph processing include the following. DisG [81]
is an ongoing project for web graph reconstruction using Hadoop. Pujol et al. [65]
studied different replication methods to scale social network analysis. Hama [5] and
Giraph [4] are two open-source projects targeting large graph processing. They adopt
Pregel's programming model and their storage is built on top of the Hadoop Distributed
File System. While the solutions mentioned above focus on batch processing, there are
transactional graph processing databases such as Neo4j and InfiniteGraph. Finally,
a number of cloud-based data management systems have recently been developed for
other important workloads such as data warehousing [2,35,77] and on-line transaction
processing [22], which are beyond the scope of this chapter.
7.3.2 Comparison of Existing Systems
Table 7.1 briefly compares a number of representative graph processing
systems with respect to graph storage, support for online processing,
main-memory processing, and distributed processing. Neo4j and HyperGraphDB
TABLE 7.1
Comparison of Representative Systems (An Extended Version Based on
Table 2 in [68])

System               Native   Online Query   Memory-Based   Distributed Parallel
                     Graphs   Processing     Exploration    Processing
Neo4j                Yes      Yes            No             No
HyperGraphDB         No       Yes            No             No
InfiniteGraph        Yes      Yes            No             Yes
MapReduce            No       No             No             Yes
PEGASUS              No       No             No             Yes
Surfer               Yes      No             Yes            Yes
Google's Pregel      No       No             No             Yes
Microsoft's Trinity  Yes      Yes            Yes            Yes
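For readers who want to filter the comparison programmatically, the entries of Table 7.1 can be encoded as a small lookup structure. This is a minimal sketch: the `Properties` type, the field names, and the `systems_with` helper are hypothetical, and the column order (native graphs, online query processing, memory-based exploration, distributed parallel processing) is assumed to follow the order in which the text lists the properties.

```python
from typing import NamedTuple


class Properties(NamedTuple):
    # Field order is an assumption that mirrors the property order in the text.
    native_graphs: bool
    online_query: bool
    memory_based: bool
    distributed: bool


Y, N = True, False

# Entries transcribed from Table 7.1.
TABLE_7_1 = {
    "Neo4j": Properties(Y, Y, N, N),
    "HyperGraphDB": Properties(N, Y, N, N),
    "InfiniteGraph": Properties(Y, Y, N, Y),
    "MapReduce": Properties(N, N, N, Y),
    "PEGASUS": Properties(N, N, N, Y),
    "Surfer": Properties(Y, N, Y, Y),
    "Pregel": Properties(N, N, N, Y),
    "Trinity": Properties(Y, Y, Y, Y),
}


def systems_with(**required):
    """Return the systems whose properties match every keyword given."""
    return [
        name
        for name, props in TABLE_7_1.items()
        if all(getattr(props, field) == value for field, value in required.items())
    ]
```

For example, `systems_with(online_query=True, distributed=True)` lists the systems that combine online query processing with distributed parallel processing according to the table.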
 