Database Reference
In-Depth Information
7.7.5 D istributeD g raPh P artitioning a lgorithms
Prior to the network bandwidth aware framework described in the previous sec-
tions, distributed graph partitioning [24,47,52] was the traditional way of reducing
data shuffling in distributed graph processing. The commonly used distributed graph
processing algorithms are multilevel algorithms [44,46,73], which are also used in
the partitioning algorithm described in this chapter. They have been proved efficient
in many applications.
7.7.6 t he m etis F ramework
A highly popular tool Metis [47] is a multilevel graph partitioning framework whose
implementation is fast, robust, and easy to use. The multilevel graph partitioning
framework contains three phases: (1) “coarsening by maximal match until the graph
is small enough”; (2) partitioning the coarsest graph by any reasonable partition
algorithm; and (3) refining the partitions by vertex swapping algorithm. ParMetis
[44,46] is a parallel multilevel graph partitioning algorithm, with a minimum bisec-
tion on each level. It has been demonstrated to perform very well on shared-memory
architectures [46]. Additionally, various different heuristics have been proposed for
the quality of coarsening and refinement (e.g., [9,59]).
7.8 OPEN PROBLEMS
Despite recent efforts in large-graph processing in the cloud, many open problems
remain to be explored in future [68]. As suggested by Shao et al., these open prob-
lems include architectural design, application needs, computation model, and owner-
ship. We briefly elaborate on the problems as follows.
7.8.1 a rChiteCtural D esign
Currently, in-memory processing is the key technique for resolving the random I/O
in graph processing (besides algorithmic design). However, as the increasing popu-
larity of graph-centric applications (such as the fast growing social graphs and web
graph), it is yet to confirm whether the in-memory solution is the most favorable
system design in terms of performance, energy consumption, and total ownership
cost, among other things. Thus, in addition to main-memory based solutions, one
may investigate other emerging storages such as solid state drives (SSD), which also
exhibits much faster random I/O speed than hard disks. A hybrid storage system of
SSD and main memory may also be possible for increasingly large graphs. More
research work is required on efficient data structures and algorithms for graphs on
such a hybrid system.
7.8.2 a PPliCation n eeDs
Web and social networks have been the two main driving applications for graph
processing. Their application needs evolve from offline to online processing. Many
Search WWH ::




Custom Search