4 Experimental Setup and Overview
Our experiments run on the Binghamton University Grid and Cloud Computing
Research Laboratory experimental research cluster, which comprises the follow-
ing components:
- 1 Master node: 4-core Intel Xeon 5150 @ 2.66 GHz, 8 GB RAM
- 24 Baseline nodes: 4-core Intel Xeon 5150 @ 2.66 GHz, 8 GB RAM
- 24 Faster nodes: 8-core Intel Xeon E545 @ 2.33 GHz, 8 GB RAM
- 12 Fastest nodes: 32-core Intel Xeon E5-2670 @ 2.60 GHz, 126 GB RAM
Each node runs 64-bit Linux 2.6.32 and shares an NFS server. Nathuji et al. [14]
report that data centers perform partial upgrades of their compute and storage
infrastructure approximately every two years. To emulate clusters that evolve in
this way, we model incremental upgrades by enabling different portions of the
cluster containing different combinations of the three classes of machines.
We do not include performance data for Hadoop because it does not support
deferred binding of tasks. In our earlier work, we compared Hadoop with our
MARLA framework under load-imbalance and fault-tolerance scenarios [9].
The comparison shows that MARLA and Hadoop have similar performance
profiles for processing floating-point data in a homogeneous cluster. However,
in a 75-node cluster with 600 cores, in which 75% of the nodes carry third-party
CPU and memory loads, MARLA takes 33% less time than Hadoop to process
300 million matrices. For the widely used MapReduce benchmark of counting
word frequencies in a 0.6 TB file, we tested Hadoop and MARLA for fault
tolerance as a 32-node cluster progressively lost 6, 8, 10, 12, 14, and 16 nodes.
MARLA consistently outperformed Hadoop in the face of node loss.
In this paper, our experiments multiply matrices containing random floating
point values. The CPU-intensity of matrix multiplication emulates the character-
istics and requirements of many Big Data applications. The differences between
Baseline, Faster, and Fastest nodes lie primarily in processor speeds and the
number of cores; therefore, CPU-intensive applications highlight this difference
most effectively. We report (i) the average time for ten runs of each experiment,
and (ii) the number of 33 × 33 matrices that are multiplied.
We design and run experiments on a cluster that utilizes a centralized file
system (NFS). We limit the scope of this paper to the realm of NFS for two
reasons. The first is based on our prior work on MARIANE [15], in which we
discuss how HPC environments are often unable to utilize the MapReduce
paradigm because of the burdens imposed by HDFS. The MARLA framework
utilizes the same code base as MARIANE, as it was also designed with such
HPC environments in mind. A comparison of how the use of HDFS affects the
performance of a MapReduce framework in such an environment was previously
presented in [15] and is omitted here due to space constraints. The second
reason we restrict our experiments to a centralized data store is evidence
suggesting that many companies, like Facebook, use