astroinformatics [4], by splitting data across processing nodes, applying the same
operation on each subset, and aggregating results.
When frameworks split data evenly across nodes, and when the map and
reduce functions are applied uniformly, the frameworks implicitly assume that
constituent nodes possess similar processing capability. When they do not,
straggler processes result and performance suffers [5, 6].
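The cost of uniform splitting can be sketched with a toy makespan calculation (my own illustration; the work sizes and speed ratios are invented, not taken from the paper): the slowest node determines when the whole phase completes, so an even split leaves faster nodes idle.

```python
# Toy model: each node receives a list of work units and processes them
# at its own speed; the phase ends when the last node finishes.

def makespan(per_node_chunks, node_speeds):
    """Time until the last node finishes its assigned work."""
    return max(sum(chunks) / speed
               for chunks, speed in zip(per_node_chunks, node_speeds))

records = [1] * 120                      # 120 identical units of work
speeds = [1.0, 1.0, 4.0]                 # the third node is 4x faster

even = [records[0:40], records[40:80], records[80:120]]
proportional = [records[0:20], records[20:40], records[40:120]]

print(makespan(even, speeds))            # 40.0: slow nodes straggle
print(makespan(proportional, speeds))    # 20.0: all nodes finish together
```

A speed-proportional split halves the makespan here, which is the intuition behind making faster nodes do more work.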
We refer to clusters whose nodes exhibit non-uniform processing capability as
being performance-heterogeneous. Performance heterogeneity can result from data
center administrators upgrading subsets of nodes incrementally, rather than
replacing or upgrading all cluster nodes at once. This can occur as funds become
available incrementally, as older nodes fail or become obsolete, and as new,
faster processors continue to emerge. FutureGrid [7] and NERSC [8] exemplify
performance-heterogeneous clusters. The FutureGrid test-bed is a geographically
distributed set of heterogeneous nodes that vary significantly in processor
speeds, number of cores, available memory, and storage technologies. NERSC's
Carver cluster includes a mix of Intel Nehalem quad-core, Westmere 6-core, and
Nehalem-EX 8-core processors, for a total of 9,984 cores.
Hadoop [1], the de facto standard MapReduce framework, can perform poorly
in performance-heterogeneous environments [5, 6, 9, 10]. To improve performance,
MapReduce applications, in concert with supporting frameworks, must consider
differences in the processing capabilities of the underlying nodes. Simply put,
faster nodes should perform more work in the same time, eliminating or greatly
reducing the need for applications to wait for straggler processes to finish
[6, 11]. Our MARLA MapReduce framework [9] supports partitioning of labor into
sub-tasks, and does not rely on the Hadoop Distributed File System (HDFS) [12].
Instead, it uses a standard implementation of the Network File System (NFS);
therefore, data need not reside on worker nodes before a MapReduce application
runs, and more capable nodes can eventually receive and process more data.
MARLA therefore does not require significant local storage space on worker
nodes, but does require data movement (via NFS or some other underlying file
system) at runtime.
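The idea of letting more capable nodes receive more data can be sketched as a pull-based task queue (a minimal illustration of the general technique, not MARLA's actual code): idle workers fetch the next sub-task from a shared queue, so a faster node naturally completes more sub-tasks without any explicit speed measurement.

```python
import threading
import time
from queue import Queue, Empty

def run(num_subtasks, worker_delays):
    """Workers pull sub-tasks until the queue drains; returns counts per worker."""
    tasks = Queue()
    for i in range(num_subtasks):
        tasks.put(i)
    counts = [0] * len(worker_delays)

    def worker(wid):
        while True:
            try:
                tasks.get_nowait()
            except Empty:
                return
            time.sleep(worker_delays[wid])  # simulated per-sub-task processing time
            counts[wid] += 1

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(len(worker_delays))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts

# Worker 1 is ten times faster, so it ends up processing most of the sub-tasks.
print(run(40, [0.02, 0.002]))
```

The finer the sub-tasks, the more closely the work distribution tracks the speed ratio; with one task per node, the pull model degenerates to an even split.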
In this paper, we configure a cluster to exhibit varying degrees of
performance-heterogeneity, and test the effectiveness of splitting MapReduce
applications with several degrees of granularity. Using smaller sub-tasks
increases the opportunity to react to performance-heterogeneity, but also
requires that the application pause more often to wait for data to arrive.
Our experiments help identify the circumstances under which the benefits of
fine-grained sub-tasking and delayed data partitioning outweigh the associated
costs. We vary cluster nodes to include two and three different levels of
processing capability, and configure different percentages of nodes at each
level. For each cluster environment, we divide application labor into sub-tasks
of different granularities, to help identify the best strategy for task
distribution on clusters with different characteristics.
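The granularity trade-off described above can be modeled with a back-of-the-envelope simulation (my own simplification, not the paper's experimental setup; the per-sub-task overhead stands in for pauses waiting for data): finer sub-tasks improve load balance across unequal nodes, but each sub-task adds a fixed cost, so runtime eventually rises again.

```python
def runtime(total_work, node_speeds, num_subtasks, overhead):
    """Makespan under greedy earliest-finish assignment of equal-size sub-tasks."""
    chunk = total_work / num_subtasks
    finish = [0.0] * len(node_speeds)
    for _ in range(num_subtasks):
        # Assign the next sub-task to the node that would finish it earliest.
        n = min(range(len(node_speeds)),
                key=lambda i: finish[i] + chunk / node_speeds[i])
        finish[n] += chunk / node_speeds[n] + overhead
    return max(finish)

speeds = [1.0, 1.0, 4.0]          # two slow nodes, one 4x-faster node
for n in (3, 12, 48, 480):
    print(n, round(runtime(120, speeds, n, 0.05), 2))
# Runtime first drops as granularity refines, then rises as overhead dominates.
```

Sweeping the sub-task count this way exposes a sweet spot that depends on the speed spread and the per-sub-task data-movement cost, which is precisely what the experiments in this paper probe empirically.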
This paper makes the following contributions:
- It demonstrates how incremental upgrades of a cluster can affect the
performance of MapReduce applications that do not respond to cluster
performance-heterogeneity. Application developers do not typically reap the
performance improvements that cluster providers purportedly pay for.