When a node is available but is performing poorly, MapReduce runs a speculative copy of its task (a backup task) on another machine to finish the computation faster. Without this mechanism of speculative execution, a job would be as slow as the misbehaving task.
This situation can arise for many reasons, including faulty hardware and system
misconfiguration. On the other hand, launching too many speculative tasks may
take away resources from useful tasks. Therefore, accurately estimating the progress and remaining time of long-running jobs is an important challenge for a runtime environment like the MapReduce framework. In particular, this information can play an important role in improving resource allocation, enhancing task scheduling, enabling query debugging, and tuning the cluster configuration. The ParaTimer system [184, 185] has been proposed to tackle this challenge. In particular,
ParaTimer provides techniques for handling several challenges including failures
and data skew. To handle unexpected changes in query execution times such as those
due to failures, ParaTimer provides users with a set of time-remaining estimates
that correspond to the predicted query execution times in different scenarios (i.e.,
a single worst-case failure, or data skew at an operator). Each of these indicators
can be annotated with the scenario to which it corresponds, giving users a detailed
picture of possible expected behaviors. To achieve this goal, ParaTimer estimates
time-remaining by breaking queries into pipelines where the time-remaining for
each pipeline is estimated by considering the work to be done and the speed at which
that work will be performed, taking (time-varying) parallelism into account. To get
processing speeds, ParaTimer relies on earlier debug runs of the same query on input
data samples generated by the user. In addition, ParaTimer identifies the critical path in a query plan and then estimates progress along that path, effectively ignoring other paths.
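To make this estimation idea concrete, the following Java sketch computes a time-remaining indicator as remaining work divided by observed processing speed, summed over the pipelines on a critical path. It is only an illustration of the general approach described above; the class and field names (Pipeline, tuplesTotal, observedTuplesPerSec, and so on) are hypothetical and do not come from ParaTimer itself.

```java
import java.util.List;

/** Illustrative sketch of pipeline-based time-remaining estimation. */
public class PipelineProgressSketch {

    /** Hypothetical per-pipeline statistics (not ParaTimer's actual data model). */
    static class Pipeline {
        long tuplesTotal;            // work the pipeline must perform, e.g. tuples to process
        long tuplesProcessed;        // work already completed
        double observedTuplesPerSec; // speed measured per worker, e.g. from an earlier debug run
        int parallelism;             // number of workers currently assigned to this pipeline

        Pipeline(long total, long done, double speed, int parallelism) {
            this.tuplesTotal = total;
            this.tuplesProcessed = done;
            this.observedTuplesPerSec = speed;
            this.parallelism = parallelism;
        }

        /** Remaining work divided by aggregate speed, guarding against a zero rate. */
        double secondsRemaining() {
            double rate = observedTuplesPerSec * Math.max(parallelism, 1);
            long remaining = Math.max(tuplesTotal - tuplesProcessed, 0);
            return rate > 0 ? remaining / rate : Double.POSITIVE_INFINITY;
        }
    }

    /** Sum the estimates of the pipelines on the critical path, ignoring other paths. */
    static double criticalPathSecondsRemaining(List<Pipeline> criticalPath) {
        return criticalPath.stream().mapToDouble(Pipeline::secondsRemaining).sum();
    }

    public static void main(String[] args) {
        List<Pipeline> criticalPath = List.of(
                new Pipeline(1_000_000, 600_000, 5_000, 8),   // map pipeline, partially done
                new Pipeline(1_000_000, 0, 2_500, 4));        // reduce pipeline, not started
        double optimistic = criticalPathSecondsRemaining(criticalPath);
        // A pessimistic indicator, e.g. assuming one worst-case worker failure or skew at an
        // operator, can be produced by re-running the estimate with adjusted inputs.
        System.out.printf("estimated time remaining: %.0f s%n", optimistic);
    }
}
```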
Zaharia et al. [236] have presented an approach to estimate the progress of MapReduce tasks in cluster environments with heterogeneous hardware configurations. In these environments, choosing the node on which to run
a speculative task is as important as choosing the task. They proposed an algorithm
for speculative execution called LATE (Longest Approximate Time to End) which
is based on three principles: prioritizing tasks to speculate, selecting fast nodes on
which to run them, and capping speculative tasks to prevent thrashing. In particular, the algorithm speculatively executes the task that it suspects will finish farthest into the future, because this task provides the greatest opportunity for a speculative copy to overtake the original and reduce the job's response time. To maximize the chance of beating the original task, the algorithm launches speculative tasks only on fast nodes (and not simply on the first available node).
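The following Java sketch illustrates the LATE heuristic of ranking running tasks by their estimated time to end, computed as (1 - progress) / progress rate, and speculating only on fast nodes and under a cap. It is a simplified illustration under assumed names (RunningTask, speculativeCap, slowNodeThreshold), not the actual Hadoop implementation of LATE.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Illustrative sketch of the LATE heuristic; field and threshold names are hypothetical. */
public class LateSchedulerSketch {

    static class RunningTask {
        String taskId;
        double progressScore;   // fraction of work completed, in [0, 1]
        double elapsedSeconds;  // how long the task has been running

        RunningTask(String id, double progress, double elapsed) {
            this.taskId = id; this.progressScore = progress; this.elapsedSeconds = elapsed;
        }

        /** LATE's core idea: time left = (1 - progress) / progress rate. */
        double estimatedTimeToEnd() {
            double rate = progressScore / Math.max(elapsedSeconds, 1e-9);
            return rate > 0 ? (1.0 - progressScore) / rate : Double.POSITIVE_INFINITY;
        }
    }

    private final int speculativeCap;        // cap on concurrent speculative tasks
    private final double slowNodeThreshold;  // nodes below this speed score never get backups
    private int speculativeRunning = 0;

    LateSchedulerSketch(int cap, double slowNodeThreshold) {
        this.speculativeCap = cap;
        this.slowNodeThreshold = slowNodeThreshold;
    }

    /**
     * When a node asks for work, speculate on the task expected to finish farthest in the
     * future, but only if the requesting node is fast and the cap is not exceeded.
     */
    Optional<RunningTask> pickSpeculativeTask(double requestingNodeSpeedScore,
                                              List<RunningTask> running) {
        if (speculativeRunning >= speculativeCap) return Optional.empty();
        if (requestingNodeSpeedScore < slowNodeThreshold) return Optional.empty();
        Optional<RunningTask> candidate = running.stream()
                .max(Comparator.comparingDouble(RunningTask::estimatedTimeToEnd));
        candidate.ifPresent(t -> speculativeRunning++);
        return candidate;
    }

    public static void main(String[] args) {
        LateSchedulerSketch late = new LateSchedulerSketch(2, 0.5);
        List<RunningTask> running = List.of(
                new RunningTask("m-001", 0.9, 100),   // almost done
                new RunningTask("m-002", 0.2, 120));  // straggler: low progress rate
        late.pickSpeculativeTask(0.8, running)
            .ifPresent(t -> System.out.println("speculate on " + t.taskId));
    }
}
```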
The RAFT (Recovery Algorithms for Fast-Tracking) system [197, 198] has been
introduced, as a part of the Hadoop++ system [123], for tracking and recovering
MapReduce jobs under task or node failures. In particular, RAFT uses two main
checkpointing mechanisms: local checkpointing and query metadata checkpointing.
On the one hand, the main idea of local checkpointing is to utilize intermediate
results, which are by default persisted by Hadoop, as checkpoints of ongoing task
progress computation. In general, map tasks spill buffered intermediate results to
local disk whenever the output buffer is on the verge of overflowing. RAFT exploits this spilling phase to piggyback checkpointing metadata on the latest spill of each map task.
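A minimal sketch of this local-checkpointing idea is shown below: whenever a map task finishes writing a spill, a small metadata record is appended so that a restarted task can resume from the last spilled position rather than reprocessing its whole input split. The record layout (task identifier, spill file path, input offset) and all class and method names here are assumptions for illustration, not RAFT's actual code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative sketch of piggybacking checkpoint metadata on a map task's spill. */
public class LocalCheckpointSketch {

    /** Hypothetical checkpoint record: which task spilled what, and how far it had read. */
    record SpillCheckpoint(String taskId, Path spillFile, long lastProcessedInputOffset) {
        String toLine() {
            return taskId + "\t" + spillFile + "\t" + lastProcessedInputOffset + "\n";
        }
    }

    private final Path checkpointLog;

    LocalCheckpointSketch(Path checkpointLog) {
        this.checkpointLog = checkpointLog;
    }

    /**
     * Called right after a map task flushes a spill to local disk: append a small metadata
     * record so a restarted task can resume from the last spilled offset instead of
     * reprocessing its whole input split.
     */
    void onSpillCompleted(SpillCheckpoint cp) throws IOException {
        Files.writeString(checkpointLog, cp.toLine(), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** On recovery, the latest record for a task tells the scheduler where to resume. */
    long latestCheckpointedOffset(String taskId) throws IOException {
        if (!Files.exists(checkpointLog)) return 0L;
        return Files.readAllLines(checkpointLog).stream()
                .map(line -> line.split("\t"))
                .filter(parts -> parts.length == 3 && parts[0].equals(taskId))
                .mapToLong(parts -> Long.parseLong(parts[2]))
                .max()
                .orElse(0L);
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("checkpoint-sketch-", ".log");
        LocalCheckpointSketch ckpt = new LocalCheckpointSketch(log);
        ckpt.onSpillCompleted(
                new SpillCheckpoint("map_0003", Path.of("/tmp/spill0.out"), 67_108_864));
        System.out.println("resume map_0003 at byte " + ckpt.latestCheckpointedOffset("map_0003"));
    }
}
```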