hardware factors result in the unevenness of the network bandwidth among machines
in the cloud.
7.4.1.1 Case Study
As discussed, the network bandwidth among different machine pairs can vary
significantly. Such network bandwidth unevenness has been observed by cloud
providers [10,41]. He et al. [19] have also observed significant network bandwidth
unevenness in Amazon EC2. Figure 7.1 shows the network bandwidth of every
machine pair among 64 and 128 small instances (i.e., virtual machines) on Amazon
EC2. The network bandwidth varies significantly. The mean and standard deviation
(in MB/sec) are (112.8, 37.5) for 64 small instances and (115.0, 40.2) for 128
small instances. Some pairwise bandwidths are very high (e.g., more than
500 MB/sec). A possible reason is that those small instances were allocated to
the same physical machine.
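The statistics quoted above can be reproduced from a pairwise bandwidth matrix with a few lines of code. The following is a minimal sketch, not the measurement code used in the cited study: the function names, the 4-instance sample matrix, and the use of the population standard deviation are all assumptions made for illustration.

```python
import itertools
import statistics


def pairwise_bandwidth_stats(bandwidth):
    """Return (mean, population std dev) over all unordered machine pairs.

    `bandwidth` is assumed to be a symmetric matrix (list of lists) in
    MB/sec; the diagonal is ignored.
    """
    n = len(bandwidth)
    values = [bandwidth[i][j] for i, j in itertools.combinations(range(n), 2)]
    return statistics.mean(values), statistics.pstdev(values)


def high_bandwidth_pairs(bandwidth, threshold):
    """List the pairs (i, j), i < j, whose bandwidth exceeds `threshold`.

    Such pairs are candidates for instances placed on the same physical
    machine.
    """
    n = len(bandwidth)
    return [
        (i, j)
        for i, j in itertools.combinations(range(n), 2)
        if bandwidth[i][j] > threshold
    ]


# Hypothetical measurements (MB/sec) for 4 instances, not real EC2 data.
sample = [
    [0, 100, 120, 520],
    [100, 0, 80, 110],
    [120, 80, 0, 90],
    [520, 110, 90, 0],
]

mean, std = pairwise_bandwidth_stats(sample)
print(mean)                               # 170
print(high_bandwidth_pairs(sample, 500))  # [(0, 3)]
```

In this made-up sample, the single pair above 500 MB/sec stands out against the rest, which mirrors the outliers described in the text.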
He et al. [19] also note that the network bandwidth between two instances in the
public cloud is stable in the short term, with similar results observed in another
study [76]. This stability makes it possible to develop network performance aware
optimizations based on the network bandwidths measured at a recent point in time.
7.4.2 Factor 2: Virtualization
In addition to hardware factors, software techniques in the cloud can result in
network bandwidth unevenness. In particular, virtualization is a crucial facility
of the cloud. It hides the network topology and the real configurations of the
underlying machines from users. In fact, in cloud environments, users do not have
administrator privileges on the hardware under the virtualization layer.
A popular optimization in virtualization is virtual machine consolidation, which
improves resource utilization. However, consolidation may cause concurrent tasks
to compete for the network bandwidth of the same physical machine. Different
degrees of consolidation therefore cause network bandwidth unevenness among
physical machines.
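The effect of consolidation can be illustrated with a deliberately simple model. The sketch below assumes that the virtual machines on one physical host share its network interface equally; real hypervisors may schedule bandwidth differently, and the function name, host names, and the 120 MB/sec capacity are hypothetical values chosen for illustration.

```python
def per_vm_bandwidth(nic_capacity_mb_s, active_vms):
    """Bandwidth available to each VM on one host, in MB/sec.

    Assumes equal sharing of the host's network interface among the
    VMs that are actively sending, which is a simplifying assumption.
    """
    if active_vms < 1:
        raise ValueError("a host must run at least one active VM")
    return nic_capacity_mb_s / active_vms


# Hypothetical hosts with different degrees of consolidation
# (number of active VMs per physical machine).
hosts = {"host_a": 1, "host_b": 2, "host_c": 4}

shares = {name: per_vm_bandwidth(120.0, vms) for name, vms in hosts.items()}
print(shares)  # {'host_a': 120.0, 'host_b': 60.0, 'host_c': 30.0}
```

Under this model, identical hardware yields per-VM bandwidths that differ by a factor of four purely because of the consolidation degree.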
7.5 NETWORK BANDWIDTH AWARE GRAPH
PARTITIONING TECHNIQUE FOR THE CLOUD
Due to the massive volume of graph data, even a baseline graph processing engine
should store a large graph as partitions rather than in a single flat storage.
However, graph partitioning itself should be effectively integrated into large
graph processing in the cloud environment. There are a number of challenging
issues in such an integration. First, graph partitioning itself is a very costly
task, which in particular generates much network traffic. Second, the network
bandwidth unevenness described in Section 1.3 affects how the graph is partitioned
and how the graph partitions are stored on the machines. Since the number of graph
partitions and the number of machines for graph processing can be very large, the
solution space of assigning graph partitions to machines is huge. Consider P
partitions to be stored on P machines, one partition per machine. The space
includes P! possible solutions. Another problem is how to make both the graph