FIGURE 1.15 Two classical ways to employ task redundancy. (a) A flat group of tasks (T0 through T4): symmetrical, with no single point of failure, but decision making is complicated. (b) A hierarchical group of tasks with a central process (i.e., T0, where Ti stands for task i): decision making is simple, but the group is asymmetrical and exhibits a single point of failure.
performance bottleneck (especially in large-scale systems with millions of users). In
contrast, as long as the coordinator is protected, the whole group remains functional.
Furthermore, decisions can be made easily by the coordinator alone, without involving any worker or incurring communication delays and performance overheads.
As a real example, Hadoop MapReduce applies task resiliency to recover from
task failures and mitigate the effects of slow tasks. Specifically, Hadoop MapReduce monitors and replicates tasks in an attempt to detect and treat slow or faulty ones. To detect slow or faulty tasks, Hadoop MapReduce relies on what is known as the heartbeat mechanism. As pointed out earlier, MapReduce adopts a
master/slave architecture. Slaves (or TaskTrackers) send their heartbeats every 3 seconds (by default) to the master (or the JobTracker). The JobTracker employs an expiry thread that checks these heartbeats and decides whether tasks at TaskTrackers are
dead or alive. If the expiry thread does not receive a heartbeat from a TaskTracker for 10 minutes (by default), the tasks at that TaskTracker are deemed dead. Otherwise, they are marked alive. Alive tasks can be slow (interchangeably referred to as stragglers) or not slow. To measure
the slowness of tasks, the JobTracker computes a progress score between 0 and 1 for each task. The progress scores of map and reduce tasks are computed differently. For a map task, the progress score is the fraction of the input HDFS block read so far. For a reduce task, the progress score is more involved. To elaborate, the execution of a reduce task is split into three main stages: the Shuffle, the Merge and Sort, and the Reduce stages. Hadoop MapReduce assumes that each of these stages accounts for one third of a reduce task's score. Within each stage, the score is the fraction of data processed so far. For instance, a reduce task halfway through