Database Reference
In-Depth Information
Clearly, if the network bandwidth among different machine pairs (
B
i,j
, ∀
i
,
j
<
P
)
is constant, minimizing the total number of cross-partition edges also minimizes the
total network data transfer time.
7.5.3.3 Partitioning the Machine Graph
An observation on multilevel graph partitioning algorithms is that
due to the divide-
and-conquer nature, there is no data exchange between the two bisection subparti-
tions generated from the same bisection.
Suppose a distinct subset of machines is
responsible for each of the two subpartitions. The network connections between the
two subsets of machines are no longer involved in the deeper levels of the bisection.
That means, the partitioning algorithm should pick the high bandwidth connections
remaining in the subset of machines, and leave the low bandwidth connections as
those between the two subsets of machines. This is analogous to performing graph
partitioning on the machine graph with respect to minimizing the total bandwidth
between two subsets of machines. That results in the correspondence between parti-
tioning the data graph and partitioning the machine graph, and the algorithm gradu-
ally assigns the subset of machines that are suitable to handle graph partitioning at
a certain level.
7.5.3.4 Network Performance Aware Partitioning
Putting these together, the algorithm traverses the partition sketches of the machine
graph and the data graph and builds a mapping between the machines and the
partitions. At each level of graph partitioning, the framework partitions the data
graph and machine graph
simultaneously
and matches the network bandwidth in
the cloud to the number of cross-partition edges according to the partition sketch
and the machine graph. The mapping guides the machines where the graph par-
tition is further partitioned, and where the graph partition is stored. At the leaf
level, graph partitions are stored in the machine in the corresponding node in the
machine graph. Finally, the partition sketches for both machine graph and data
graph are generated.
Example
Figure 7.2 illustrates the mapping between two machine graphs and a data graph
for the partitioning framework. Take case (a) where four machines are selected as
an example. The bisection on the entire graph
G
is done on all the four machines.
At the next level, the bisections on
G
1
and
G
2
are performed on pods
M
1
and
M
2
,
respectively. Finally, the partitions are stored in the machines according to the
mapping.
Regarding the local graph partitioning algorithm, any classical graph partition-
ing algorithms such as Metis [47] can be used. For example, Metis can be used to
partition the machine graph, since the machine graph can often fit into the main
memory of a single machine. On the bisection of the machine graph, the objective
function is to minimize the weight of the cross-partition edges with the constraint
of two partitions having around the same number of machines. This objective
function matches the bandwidth unevenness of the selected machines. The goal
Search WWH ::
Custom Search