Beyond machine-level parallelism, it is desirable to exploit intra-machine parallelism.
On multicore CPUs, parallel libraries like MTGL [12] have been developed for par-
allel graph algorithms. MTGL offers a set of data structures and APIs for building
graph algorithms. The MTGL API is modeled after the Boost Graph Library [69] and
optimized to leverage shared memory multithreaded machines. The SNAP framework
[7] provides a set of algorithms and building blocks for graph analysis, especially for
small-world graphs. On the GPU, a general-purpose programming framework called
Medusa [80] has been developed. The goal is to hide the details of graph programming
and GPU runtime from users. In contrast to Pregel, Medusa adopts very fine-grained
processing on vertices/edges/messages to exploit the massive parallelism of the GPU.
Additionally, there are specific parallel graph algorithms on the GPU [34,36,48,75].
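The contrast between Pregel's coarse vertex-centric model and Medusa's fine-grained edge-level decomposition can be sketched in plain Python. This is an illustrative CPU sketch only; the toy graph and variable names are assumptions, and a real GPU framework would launch one thread per edge or message rather than loop sequentially.

```python
# Toy directed graph as an edge list: (src, dst).
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
num_vertices = 3
value = [1.0, 1.0, 1.0]  # one value per vertex

# Vertex-centric (Pregel-like): each unit of work is a whole vertex,
# which iterates over all of its out-edges.
out_edges = {v: [d for s, d in edges if s == v] for v in range(num_vertices)}
recv_v = [0.0] * num_vertices
for v in range(num_vertices):          # one task per vertex
    share = value[v] / max(len(out_edges[v]), 1)
    for d in out_edges[v]:
        recv_v[d] += share

# Edge-level (Medusa-like): each unit of work is a single edge, so the
# available parallelism scales with the number of edges, not vertices --
# a better match for the massive thread counts of a GPU.
out_degree = [len(out_edges[v]) for v in range(num_vertices)]
recv_e = [0.0] * num_vertices
for s, d in edges:                     # one task per edge
    recv_e[d] += value[s] / out_degree[s]
```

Both decompositions compute the same per-vertex result; only the granularity of the work units differs.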
7.4 UNEVEN BANDWIDTH BETWEEN THE MACHINES OF THE CLOUD
The cloud-based solutions discussed in the previous section provide a user-friendly platform on which users can develop their custom logic without worrying about how the underlying interconnected machines operate. However, the unique network environment that connects such a large number of servers adds further challenges to large graph processing. In this section, we discuss the hardware and software factors that cause network bandwidth unevenness in the cloud.
7.4.1 Factor 1: Network Environment
Due to its significant scale, the cloud network environment differs markedly from previous distributed environments [44,46,52], for example, Cray supercomputers or small-scale clusters. In a small-scale cluster, the network bandwidth is often roughly the same for every machine pair. In the cloud environment, however, the network bandwidth is uneven among different machine pairs.
Current cloud infrastructures often use a switch-based tree structure to interconnect the servers [10,32,41]. Machines are first grouped into pods, and pods are then connected to higher-level switches. A natural consequence of such a topology is that the network bandwidth between any machine pair is not uniform: it is determined by the switches that connect the two machines [37]. The intra-pod bandwidth is much higher than the cross-pod bandwidth.
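This intra-pod versus cross-pod distinction can be modeled with a small sketch. The pod size and bandwidth figures below are illustrative assumptions, not measurements from any particular cloud.

```python
# Sketch of pod-based bandwidth unevenness: machines in the same pod
# communicate through one low-level switch at full speed, while
# cross-pod traffic traverses higher-level switches at lower effective
# bandwidth. All constants are illustrative assumptions.

MACHINES_PER_POD = 4
INTRA_POD_GBPS = 10.0   # assumed bandwidth within a pod
CROSS_POD_GBPS = 1.0    # assumed bandwidth across pods

def pod_of(machine_id: int) -> int:
    """Pod index of a machine under simple sequential grouping."""
    return machine_id // MACHINES_PER_POD

def pair_bandwidth(a: int, b: int) -> float:
    """Estimated bandwidth (Gb/sec) between two machines."""
    return INTRA_POD_GBPS if pod_of(a) == pod_of(b) else CROSS_POD_GBPS
```

For instance, machines 0 and 3 share pod 0 and see the high intra-pod bandwidth, while machines 0 and 5 sit in different pods and see the lower cross-pod bandwidth.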
Knowledge of the network topology (exploited, for example, in multilevel data reduction along the tree topology [23] and partition-based locality optimizations [64]) and scheduling techniques [38] is crucial for advanced optimization in the cloud. However, it should also be remarked that topology information is usually not available to cloud users due to virtualization and system management issues.
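The multilevel data reduction mentioned above can be sketched as a two-level sum: partial results are first combined inside each pod over the cheap high-bandwidth links, and only one value per pod crosses the oversubscribed higher-level switches. The pod grouping and data below are illustrative assumptions.

```python
# Topology-aware reduction sketch: aggregate pod-first so that the
# number of messages crossing the slow inter-pod links equals the
# number of pods, not the number of machines.

def two_level_reduce(values_by_machine, pod_of):
    """Sum per-machine values within each pod, then across pods.

    Returns the global sum and the number of cross-pod messages sent
    (one partial sum per pod).
    """
    pod_partials = {}
    for machine, v in values_by_machine.items():
        pod = pod_of(machine)
        pod_partials[pod] = pod_partials.get(pod, 0) + v  # intra-pod step
    # Cross-pod step: only one partial sum per pod leaves the pod.
    return sum(pod_partials.values()), len(pod_partials)

values = {0: 3, 1: 5, 2: 7, 3: 2, 4: 8, 5: 1}  # six machines
total, cross_pod_msgs = two_level_reduce(values, lambda m: m // 3)
```

With six machines grouped into pods of three, only two partial sums cross the inter-pod links, instead of six per-machine messages under a flat reduction.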
Finally, a simple reason for network unevenness is that the commodity computers in the cloud may not have a uniform network configuration (e.g., network adaptors). As the cloud evolves, its computers may become heterogeneous from generation to generation [79]. For example, current mainstream network adaptors provide 1 Gb/sec, while adaptors with 10 Gb/sec are gradually being employed. These