Center [8]. To run Hadoop and HDFS, these HPC centers typically partition their
clusters and dedicate sub-parts for exclusive use by Hadoop [15]. MARLA can
instead operate on existing shared file systems such as NFS or GPFS [16]. This
feature increases the number of nodes available for MapReduce jobs, removes the
requirement that individual nodes contain significant local storage, and enables
MARLA to support scientific applications that require POSIX compliance.
The MARLA Splitter manages framework I/O. Framework configuration
parameters drive and determine the division of application input into chunks.
Different configuration parameters specify (i) the number of tasks, and (ii) the
number of cores on each worker node. Workers request tasks and receive all
associated input chunk data. To facilitate processing in a heterogeneous
environment, MARLA allows the user to configure a number of tasks for the data
to be split into. This parameter defines how many data chunks the input should
be divided into, which allows the user to adopt a bag-of-tasks approach to
combating heterogeneity. After the Splitter divides the tasks into input data
chunks, it sub-divides those chunks into as many sub-tasks as there are cores
on each worker node, a value defined by a framework parameter. This is done to
facilitate
multi-threading on worker nodes. When a worker node requests a task, the file
handle gets passed as an argument, and the file system ensures that the worker
node can access the file.
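To illustrate this two-level division, the following Python sketch splits an
input of a given size first into the configured number of task chunks and then
into one sub-task per worker core. The function and parameter names are
hypothetical, record boundaries are ignored, and the code is not MARLA's
implementation; it only mirrors the splitting logic described above.

    # Hypothetical sketch of the two-level split: the input is first cut into
    # num_tasks chunks, and each chunk is sub-divided into one sub-task per
    # worker core. Workers would read these byte ranges through the shared
    # file system (e.g., NFS or GPFS).
    def split_input(input_size_bytes, num_tasks, cores_per_worker):
        chunk_size = input_size_bytes // num_tasks
        layout = []
        for task_id in range(num_tasks):
            chunk_start = task_id * chunk_size
            # The last chunk absorbs any remainder bytes.
            chunk_len = (input_size_bytes - chunk_start
                         if task_id == num_tasks - 1 else chunk_size)
            sub_size = chunk_len // cores_per_worker
            for core in range(cores_per_worker):
                sub_start = chunk_start + core * sub_size
                sub_len = (chunk_len - core * sub_size
                           if core == cores_per_worker - 1 else sub_size)
                layout.append((task_id, core, sub_start, sub_len))
        return layout

    # Example: a 1 GB input, 8 tasks, 4 cores per worker -> 32 sub-tasks.
    print(len(split_input(1 << 30, 8, 4)))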
Hadoop instead splits and replicates data based on block size, and places it
based on node storage capacity, among other factors. Data placement influences
the nodes on which workers complete tasks, often well before the application
runs. Although tasks can migrate from one node to another at the request of
the Master, the system's implicit preference toward local tasks makes it difficult
for Hadoop's straggler mitigation technique to keep up with the non-uniform
processing capability of the cluster nodes when only portions of the cluster have
been upgraded [6, 11].
MARLA's TaskController, or Master, makes the user's map and reduce code
available to workers, and starts and stops MapReduce jobs. The TaskController
monitors task progress on behalf of worker nodes, and resubmits failed tasks
to the FaultTracker. The FaultTracker monitors tasks for failure, issuing a
“strike” against any node that fails on a task that a worker on another node
successfully completes. Three strikes relegate a worker to a blacklist,
precluding it from further participation in the job.
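The three-strike policy can be sketched in a few lines of Python. The class
and method names below are hypothetical and the code is not MARLA's
implementation; it only captures the rule that a node receives a strike when a
task it failed later succeeds on another node, and is blacklisted after three
strikes.

    # Hypothetical bookkeeping for the three-strike blacklist policy.
    MAX_STRIKES = 3

    class FaultTrackerSketch:
        def __init__(self):
            self.strikes = {}        # node -> strike count
            self.blacklist = set()   # nodes excluded from the rest of the job
            self.failed_on = {}      # task_id -> set of nodes it failed on

        def record_failure(self, task_id, node):
            self.failed_on.setdefault(task_id, set()).add(node)

        def record_success(self, task_id, node):
            # Strike every node that failed a task another node has completed.
            for offender in self.failed_on.pop(task_id, set()):
                if offender == node or offender in self.blacklist:
                    continue
                self.strikes[offender] = self.strikes.get(offender, 0) + 1
                if self.strikes[offender] >= MAX_STRIKES:
                    self.blacklist.add(offender)

    tracker = FaultTrackerSketch()
    for task_id in range(3):
        tracker.record_failure(task_id, "node-7")
        tracker.record_success(task_id, "node-3")
    print(tracker.blacklist)  # node-7 is blacklisted after its third strike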
Originally, the slowest MapReduce tasks, straggler tasks, limited and
determined the turnaround time of larger MapReduce jobs. Causes of straggler
tasks include less capable node hardware, external load, and variances in
input chunk data, some of which may require more processing than others. To
adapt to these challenges without making assumptions based on static
profiling, MARLA supports the bag-of-tasks model to combat both static and
dynamic heterogeneity.
In this paper we characterize the performance of this bag-of-tasks approach
within a MapReduce framework. We identify beneficial framework configurations
for adapting to performance-heterogeneous clusters. Assigning increasing
numbers of tasks per node allows frameworks to divide data and tasks to better
match node capabilities, but invites overhead.
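As a back-of-the-envelope illustration of this trade-off, the following Python
model (hypothetical, not a result from this paper) simulates a greedy
bag-of-tasks schedule on a two-node cluster in which one node is twice as fast
as the other and every task carries a fixed startup overhead. The simulated
makespan shrinks as the bag becomes finer-grained, then grows again once the
per-task overhead dominates.

    # Hypothetical greedy bag-of-tasks model: each idle node pulls the next task.
    def makespan(total_work, tasks, node_speeds, per_task_overhead):
        task_work = total_work / tasks
        finish = {node: 0.0 for node in node_speeds}
        for _ in range(tasks):
            node = min(finish, key=finish.get)  # first node to become idle
            finish[node] += task_work / node_speeds[node] + per_task_overhead
        return max(finish.values())

    speeds = {"fast": 2.0, "slow": 1.0}  # relative processing rates
    for tasks in (2, 8, 32, 128):
        print(tasks, round(makespan(120.0, tasks, speeds, 0.5), 1))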