4 Experimental Setup and Overview
Our experiments run on the Binghamton University Grid and Cloud Computing
Research Laboratory experimental research cluster, which comprises the follow-
ing components:
- 1 Master node: 4-core Intel Xeon 5150 @ 2.66 GHz, 8 GB RAM
- 24 Baseline nodes: 4-core Intel Xeon 5150 @ 2.66 GHz, 8 GB RAM
- 24 Faster nodes: 8-core Intel Xeon E545 @ 2.33 GHz, 8 GB RAM
- 12 Fastest nodes: 32-core Intel Xeon E5-2670 @ 2.60 GHz, 126 GB RAM
Each node runs 64-bit Linux 2.6.32 and shares an NFS server. Nathuji et al. [14]
report that data centers perform partial upgrades of their compute and storage
infrastructure approximately every two years. To emulate clusters that evolve in
this way, we model incremental upgrades by enabling different portions of the
cluster containing different combinations of the three classes of machines.
We do not include performance data for Hadoop because it does not support
deferred binding of tasks. In our earlier work, we compared Hadoop with our
MARLA framework under load-imbalance and fault-tolerance scenarios [9].
The comparison shows that MARLA and Hadoop have similar performance
profiles for processing floating-point data in a homogeneous cluster. However,
in a 75-node cluster with 600 cores, in which 75% of the nodes carry third-party
CPU and memory loads, MARLA takes 33% less time than Hadoop to process
300 million matrices. For the widely used MapReduce benchmark of counting
word frequencies in a 0.6 TB file, we tested Hadoop and MARLA for fault
tolerance as a 32-node cluster progressively lost 6, 8, 10, 12, 14, and 16 nodes.
MARLA consistently outperformed Hadoop in the face of node loss.
In this paper, our experiments multiply matrices containing random floating
point values. The CPU-intensity of matrix multiplication emulates the character-
istics and requirements of many Big Data applications. The differences between
Baseline, Faster, and Fastest nodes lie primarily in processor speeds and the
number of cores; therefore, CPU-intensive applications highlight this difference
most effectively. We report (i) the average time for ten runs of each experiment,
and (ii) the number of 33 × 33 matrices that are multiplied.
We design and run experiments on a cluster that utilizes a centralized file
system (NFS). We limit the scope of this paper to the realm of NFS for two
reasons. The first is based on our prior work on MARIANE [15], in which we
discuss how HPC environments are often unable to utilize the MapReduce
paradigm because of the burdens imposed by HDFS. The MARLA framework
utilizes the same code base as MARIANE, as it was also designed with such
HPC environments in mind. A comparison of how the use of HDFS affects the
performance of a MapReduce framework in such an environment was previously
presented in [15] and is omitted here due to space constraints. The second
reason we restrict our experiments to a centralized data store is evidence
suggesting that many companies, like Facebook, use