When a node is available but is performing poorly, MapReduce runs a speculative copy of its task (a backup task) on another machine to finish the computation faster. Without this mechanism of speculative execution, a job would be as slow as the misbehaving task.
This situation can arise for many reasons, including faulty hardware and system
misconfiguration. On the other hand, launching too many speculative tasks may
take away resources from useful tasks. Therefore, accurately estimating the progress and remaining time of long-running jobs is an important challenge for a runtime environment like the MapReduce framework. In particular, this information can play an important role in improving resource allocation, enhancing task scheduling, enabling query debugging, and tuning the cluster configuration. The ParaTimer system [184, 185] has been proposed to tackle this challenge. In particular,
ParaTimer provides techniques for handling several challenges including failures
and data skew. To handle unexpected changes in query execution times such as those
due to failures, ParaTimer provides users with a set of time-remaining estimates
that correspond to the predicted query execution times in different scenarios (i.e.,
a single worst-case failure, or data skew at an operator). Each of these indicators
can be annotated with the scenario to which it corresponds, giving users a detailed
picture of possible expected behaviors. To achieve this goal, ParaTimer estimates
time-remaining by breaking queries into pipelines where the time-remaining for
each pipeline is estimated by considering the work to be done and the speed at which
that work will be performed, taking (time-varying) parallelism into account. To get
processing speeds, ParaTimer relies on earlier debug runs of the same query on input
data samples generated by the user. In addition, ParaTimer identifies the critical path in a query plan and then estimates progress along that path, effectively ignoring other paths.
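To make this estimation idea concrete, the following Java sketch computes a time-remaining indicator as remaining work divided by observed processing speed, summed over the pipelines on a critical path. It is only an illustration of the general approach described above; the class and field names (Pipeline, tuplesTotal, observedTuplesPerSec, and so on) are hypothetical and do not come from ParaTimer itself.

```java
import java.util.List;

/** Illustrative sketch of pipeline-based time-remaining estimation. */
public class PipelineProgressSketch {

    /** Hypothetical per-pipeline statistics (not ParaTimer's actual data model). */
    static class Pipeline {
        long tuplesTotal;            // work the pipeline must perform, e.g. tuples to process
        long tuplesProcessed;        // work already completed
        double observedTuplesPerSec; // speed measured per worker, e.g. from an earlier debug run
        int parallelism;             // number of workers currently assigned to this pipeline

        Pipeline(long total, long done, double speed, int parallelism) {
            this.tuplesTotal = total;
            this.tuplesProcessed = done;
            this.observedTuplesPerSec = speed;
            this.parallelism = parallelism;
        }

        /** Remaining work divided by aggregate speed, guarding against a zero rate. */
        double secondsRemaining() {
            double rate = observedTuplesPerSec * Math.max(parallelism, 1);
            long remaining = Math.max(tuplesTotal - tuplesProcessed, 0);
            return rate > 0 ? remaining / rate : Double.POSITIVE_INFINITY;
        }
    }

    /** Sum the estimates of the pipelines on the critical path, ignoring other paths. */
    static double criticalPathSecondsRemaining(List<Pipeline> criticalPath) {
        return criticalPath.stream().mapToDouble(Pipeline::secondsRemaining).sum();
    }

    public static void main(String[] args) {
        List<Pipeline> criticalPath = List.of(
                new Pipeline(1_000_000, 600_000, 5_000, 8),   // map pipeline, partially done
                new Pipeline(1_000_000, 0, 2_500, 4));        // reduce pipeline, not started
        double optimistic = criticalPathSecondsRemaining(criticalPath);
        // A pessimistic indicator, e.g. assuming one worst-case worker failure or skew at an
        // operator, can be produced by re-running the estimate with adjusted inputs.
        System.out.printf("estimated time remaining: %.0f s%n", optimistic);
    }
}
```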
Zaharia et al. [236] have presented an approach to estimate the progress of MapReduce tasks in cluster environments with heterogeneous hardware configurations. In these environments, choosing the node on which to run
a speculative task is as important as choosing the task. They proposed an algorithm
for speculative execution called LATE (Longest Approximate Time to End) which
is based on three principles: prioritizing tasks to speculate, selecting fast nodes on
which to run them, and capping speculative tasks to prevent thrashing. In particular, the algorithm speculatively executes the task that it suspects will finish farthest into the future, because this task provides the greatest opportunity for a speculative copy to overtake the original and reduce the job's response time. To maximize the chance of beating the original task, the algorithm launches speculative tasks only on fast nodes (and not simply on the first available node).
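The following Java sketch illustrates the LATE heuristic of ranking running tasks by their estimated time to end, computed as (1 - progress) / progress rate, and speculating only on fast nodes and under a cap. It is a simplified illustration under assumed names (RunningTask, speculativeCap, slowNodeThreshold), not the actual Hadoop implementation of LATE.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Illustrative sketch of the LATE heuristic; field and threshold names are hypothetical. */
public class LateSchedulerSketch {

    static class RunningTask {
        String taskId;
        double progressScore;   // fraction of work completed, in [0, 1]
        double elapsedSeconds;  // how long the task has been running

        RunningTask(String id, double progress, double elapsed) {
            this.taskId = id; this.progressScore = progress; this.elapsedSeconds = elapsed;
        }

        /** LATE's core idea: time left = (1 - progress) / progress rate. */
        double estimatedTimeToEnd() {
            double rate = progressScore / Math.max(elapsedSeconds, 1e-9);
            return rate > 0 ? (1.0 - progressScore) / rate : Double.POSITIVE_INFINITY;
        }
    }

    private final int speculativeCap;        // cap on concurrent speculative tasks
    private final double slowNodeThreshold;  // nodes below this speed score never get backups
    private int speculativeRunning = 0;

    LateSchedulerSketch(int cap, double slowNodeThreshold) {
        this.speculativeCap = cap;
        this.slowNodeThreshold = slowNodeThreshold;
    }

    /**
     * When a node asks for work, speculate on the task expected to finish farthest in the
     * future, but only if the requesting node is fast and the cap is not exceeded.
     */
    Optional<RunningTask> pickSpeculativeTask(double requestingNodeSpeedScore,
                                              List<RunningTask> running) {
        if (speculativeRunning >= speculativeCap) return Optional.empty();
        if (requestingNodeSpeedScore < slowNodeThreshold) return Optional.empty();
        Optional<RunningTask> candidate = running.stream()
                .max(Comparator.comparingDouble(RunningTask::estimatedTimeToEnd));
        candidate.ifPresent(t -> speculativeRunning++);
        return candidate;
    }

    public static void main(String[] args) {
        LateSchedulerSketch late = new LateSchedulerSketch(2, 0.5);
        List<RunningTask> running = List.of(
                new RunningTask("m-001", 0.9, 100),   // almost done
                new RunningTask("m-002", 0.2, 120));  // straggler: low progress rate
        late.pickSpeculativeTask(0.8, running)
            .ifPresent(t -> System.out.println("speculate on " + t.taskId));
    }
}
```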
The RAFT (Recovery Algorithms for Fast-Tracking) system [197, 198] has been
introduced, as a part of the Hadoop++ system [123], for tracking and recovering
MapReduce jobs under task or node failures. In particular, RAFT uses two main
checkpointing mechanisms: local checkpointing and query metadata checkpointing.
On the one hand, the main idea of local checkpointing is to utilize intermediate
results, which are by default persisted by Hadoop, as checkpoints of ongoing task
progress computation. In general, map tasks spill buffered intermediate results to
local disk whenever the output buffer is on the verge of overflowing. RAFT exploits this spilling phase to piggyback checkpointing metadata on the latest spill of each map task.
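A minimal sketch of this local-checkpointing idea is shown below: whenever a map task finishes writing a spill, a small metadata record is appended so that a restarted task can resume from the last spilled position rather than reprocessing its whole input split. The record layout (task identifier, spill file path, input offset) and all class and method names here are assumptions for illustration, not RAFT's actual code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative sketch of piggybacking checkpoint metadata on a map task's spill. */
public class LocalCheckpointSketch {

    /** Hypothetical checkpoint record: which task spilled what, and how far it had read. */
    record SpillCheckpoint(String taskId, Path spillFile, long lastProcessedInputOffset) {
        String toLine() {
            return taskId + "\t" + spillFile + "\t" + lastProcessedInputOffset + "\n";
        }
    }

    private final Path checkpointLog;

    LocalCheckpointSketch(Path checkpointLog) {
        this.checkpointLog = checkpointLog;
    }

    /**
     * Called right after a map task flushes a spill to local disk: append a small metadata
     * record so a restarted task can resume from the last spilled offset instead of
     * reprocessing its whole input split.
     */
    void onSpillCompleted(SpillCheckpoint cp) throws IOException {
        Files.writeString(checkpointLog, cp.toLine(), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** On recovery, the latest record for a task tells the scheduler where to resume. */
    long latestCheckpointedOffset(String taskId) throws IOException {
        if (!Files.exists(checkpointLog)) return 0L;
        return Files.readAllLines(checkpointLog).stream()
                .map(line -> line.split("\t"))
                .filter(parts -> parts.length == 3 && parts[0].equals(taskId))
                .mapToLong(parts -> Long.parseLong(parts[2]))
                .max()
                .orElse(0L);
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("checkpoint-sketch-", ".log");
        LocalCheckpointSketch ckpt = new LocalCheckpointSketch(log);
        ckpt.onSpillCompleted(
                new SpillCheckpoint("map_0003", Path.of("/tmp/spill0.out"), 67_108_864));
        System.out.println("resume map_0003 at byte " + ckpt.latestCheckpointedOffset("map_0003"));
    }
}
```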