Center [8]. To run Hadoop and HDFS, these HPC centers typically partition their
clusters and dedicate sub-parts for exclusive use by Hadoop [15]. MARLA can
instead operate on existing shared file systems such as NFS or GPFS [16]. This
feature increases the number of nodes available for MapReduce jobs, removes the
requirement that individual nodes contain significant local storage, and enables
MARLA to support scientific applications that require POSIX compliance.
The MARLA Splitter manages framework I/O. Framework configuration
parameters drive and determine the division of application input into chunks.
Different configuration parameters specify (i) the number of tasks, and (ii) the
number of cores on each worker node. Workers request tasks and receive all
associated input chunk data. To facilitate processing in a heterogeneous
environment, MARLA allows the user to configure a number of tasks for the data
to be split into. This parameter defines how many data chunks the input should
be divided into, which allows the user to adopt a bag-of-tasks approach to
combating heterogeneity. After the Splitter divides the tasks into input data
chunks, it sub-divides those chunks into as many sub-tasks as there are cores
on each worker node, a value defined by a framework parameter. This is done to
facilitate
multi-threading on worker nodes. When a worker node requests a task, the file
handle gets passed as an argument, and the file system ensures that the worker
node can access the file.
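To illustrate this two-level division, the following Python sketch splits an
input of a given size first into the configured number of task chunks and then
into one sub-task per worker core. The function and parameter names are
hypothetical, record boundaries are ignored, and the code is not MARLA's
implementation; it only mirrors the splitting logic described above.

    # Hypothetical sketch of the two-level split: the input is first cut into
    # num_tasks chunks, and each chunk is sub-divided into one sub-task per
    # worker core. Workers would read these byte ranges through the shared
    # file system (e.g., NFS or GPFS).
    def split_input(input_size_bytes, num_tasks, cores_per_worker):
        chunk_size = input_size_bytes // num_tasks
        layout = []
        for task_id in range(num_tasks):
            chunk_start = task_id * chunk_size
            # The last chunk absorbs any remainder bytes.
            chunk_len = (input_size_bytes - chunk_start
                         if task_id == num_tasks - 1 else chunk_size)
            sub_size = chunk_len // cores_per_worker
            for core in range(cores_per_worker):
                sub_start = chunk_start + core * sub_size
                sub_len = (chunk_len - core * sub_size
                           if core == cores_per_worker - 1 else sub_size)
                layout.append((task_id, core, sub_start, sub_len))
        return layout

    # Example: a 1 GB input, 8 tasks, 4 cores per worker -> 32 sub-tasks.
    print(len(split_input(1 << 30, 8, 4)))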
Hadoop instead splits and replicates data based on block size, and places it
based on node storage capacity, among other factors. Data placement influences
the nodes on which workers complete tasks, often well before the application
runs. Although tasks can migrate from one node to another at the request of
the Master, the system's implicit preference toward local tasks makes it difficult
for Hadoop's straggler mitigation technique to keep up with the non-uniform
processing capability of the cluster nodes when only portions of the cluster have
been upgraded [6, 11].
MARLA's TaskController, or Master, makes the user's map and reduce code
available to workers, and starts and stops MapReduce jobs. The TaskController
monitors task progress on behalf of worker nodes, and resubmits failed tasks
to the FaultTracker. The FaultTracker monitors tasks for failure, issuing a
“strike” against any node that fails on a task that a worker on another node
successfully completes. Three strikes relegate a worker to a blacklist,
precluding it from further participation in the job.
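The three-strike policy can be sketched in a few lines of Python. The class
and method names below are hypothetical and the code is not MARLA's
implementation; it only captures the rule that a node receives a strike when a
task it failed later succeeds on another node, and is blacklisted after three
strikes.

    # Hypothetical bookkeeping for the three-strike blacklist policy.
    MAX_STRIKES = 3

    class FaultTrackerSketch:
        def __init__(self):
            self.strikes = {}        # node -> strike count
            self.blacklist = set()   # nodes excluded from the rest of the job
            self.failed_on = {}      # task_id -> set of nodes it failed on

        def record_failure(self, task_id, node):
            self.failed_on.setdefault(task_id, set()).add(node)

        def record_success(self, task_id, node):
            # Strike every node that failed a task another node has completed.
            for offender in self.failed_on.pop(task_id, set()):
                if offender == node or offender in self.blacklist:
                    continue
                self.strikes[offender] = self.strikes.get(offender, 0) + 1
                if self.strikes[offender] >= MAX_STRIKES:
                    self.blacklist.add(offender)

    tracker = FaultTrackerSketch()
    for task_id in range(3):
        tracker.record_failure(task_id, "node-7")
        tracker.record_success(task_id, "node-3")
    print(tracker.blacklist)  # node-7 is blacklisted after its third strike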
Originally, the slowest MapReduce tasks, straggler tasks, limited and
determined the turnaround time of larger MapReduce jobs. Causes of straggler
tasks include less capable node hardware, external load, and variances in
input chunk data, some of which may require more processing than others. To
adapt to these challenges without making assumptions based on static
profiling, MARLA supports the bag-of-tasks model to combat both static and
dynamic heterogeneity.
In this paper we characterize the performance of this bag-of-tasks approach
within a MapReduce framework. We identify beneficial framework configurations
for adapting to performance-heterogeneous clusters. Assigning increasing
numbers of tasks per node allows frameworks to divide data and tasks to better
match node capabilities, but invites overhead.
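As a back-of-the-envelope illustration of this trade-off, the following Python
model (hypothetical, not a result from this paper) simulates a greedy
bag-of-tasks schedule on a two-node cluster in which one node is twice as fast
as the other and every task carries a fixed startup overhead. The simulated
makespan shrinks as the bag becomes finer-grained, then grows again once the
per-task overhead dominates.

    # Hypothetical greedy bag-of-tasks model: each idle node pulls the next task.
    def makespan(total_work, tasks, node_speeds, per_task_overhead):
        task_work = total_work / tasks
        finish = {node: 0.0 for node in node_speeds}
        for _ in range(tasks):
            node = min(finish, key=finish.get)  # first node to become idle
            finish[node] += task_work / node_speeds[node] + per_task_overhead
        return max(finish.values())

    speeds = {"fast": 2.0, "slow": 1.0}  # relative processing rates
    for tasks in (2, 8, 32, 128):
        print(tasks, round(makespan(120.0, tasks, speeds, 0.5), 1))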