execution to avoid these problems is unwise and won't work reliably, since the same bugs
are likely to affect the speculative task. You should fix the bug so that the task doesn't
hang or slow down.
Speculative execution is turned on by default. It can be enabled or disabled independently
for map tasks and reduce tasks, on a cluster-wide basis, or on a per-job basis. The relevant
properties are shown in Table 7-4.
Table 7-4. Speculative execution properties

Property name                                   Type     Default value
mapreduce.map.speculative                       boolean  true
mapreduce.reduce.speculative                    boolean  true
yarn.app.mapreduce.am.job.speculator.class      Class    org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator
yarn.app.mapreduce.am.job.task.estimator.class  Class    org.apache.hadoop.mapreduce.v2.app.speculate.LegacyTaskRuntimeEstimator  An implementation of
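The layering of cluster-wide and per-job settings can be sketched with plain java.util.Properties. This is only a toy model of how a per-job value might override a cluster-wide one. The class SpeculationSettings and its methods are hypothetical names invented for illustration and are not part of Hadoop. In a real job the same two property keys would be set on the job's Hadoop Configuration, for example with Configuration.setBoolean.

```java
import java.util.Properties;

// Hypothetical helper, not Hadoop API: models built-in defaults, a
// cluster-wide override layer, and a per-job override layer.
class SpeculationSettings {
    static final String MAP_KEY = "mapreduce.map.speculative";
    static final String REDUCE_KEY = "mapreduce.reduce.speculative";

    // Built-in defaults: speculative execution is on for both task types.
    static Properties builtInDefaults() {
        Properties p = new Properties();
        p.setProperty(MAP_KEY, "true");
        p.setProperty(REDUCE_KEY, "true");
        return p;
    }

    // Returns a new layer whose lookups fall back to 'parent' for any key
    // that is not overridden in keyValuePairs (key1, value1, key2, value2, ...).
    static Properties layer(Properties parent, String... keyValuePairs) {
        Properties p = new Properties(parent);
        for (int i = 0; i + 1 < keyValuePairs.length; i += 2) {
            p.setProperty(keyValuePairs[i], keyValuePairs[i + 1]);
        }
        return p;
    }

    static boolean isEnabled(Properties p, String key) {
        return Boolean.parseBoolean(p.getProperty(key));
    }

    public static void main(String[] args) {
        // An administrator turns speculation off for the whole cluster.
        Properties cluster = layer(builtInDefaults(),
                MAP_KEY, "false", REDUCE_KEY, "false");
        // One job opts back in, for map tasks only.
        Properties job = layer(cluster, MAP_KEY, "true");

        System.out.println("map speculative:    " + isEnabled(job, MAP_KEY));
        System.out.println("reduce speculative: " + isEnabled(job, REDUCE_KEY));
    }
}
```

The two keys are independent, so a job can re-enable speculation for map tasks while leaving reduce tasks at the cluster-wide value.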
Why would you ever want to turn speculative execution off? The goal of speculative
execution is to reduce job execution time, but this comes at the cost of cluster efficiency. On a
busy cluster, speculative execution can reduce overall throughput, since redundant tasks
are being executed in an attempt to bring down the execution time for a single job. For
this reason, some cluster administrators prefer to turn it off on the cluster and have users
explicitly turn it on for individual jobs. This was especially relevant for older versions of
Hadoop, when speculative execution could be overly aggressive in scheduling speculative
tasks.
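The trade-off between job time and cluster efficiency can be illustrated with a deliberately simplified model. It is a sketch under stated assumptions, not how Hadoop's speculator actually decides. The assumptions are that every task starts at time zero, that any task still running at a fixed time launchAt gets one backup attempt taking backupDuration seconds, and that the slower attempt is killed as soon as the faster one finishes. The class name SpeculationTradeoff and all parameters are hypothetical.

```java
// Toy model only: compares job completion time and total slot-seconds
// consumed with and without a single backup attempt per slow task.
class SpeculationTradeoff {

    // Returns { jobCompletionTime, totalSlotSeconds }.
    static long[] simulate(long[] durations, boolean speculate,
                           long launchAt, long backupDuration) {
        long jobTime = 0;
        long slotSeconds = 0;
        for (long d : durations) {
            long finish = d; // time at which the task's output is ready
            long used = d;   // slot-seconds consumed by all attempts
            if (speculate && d > launchAt) {
                // The task completes when either attempt finishes first.
                finish = Math.min(d, launchAt + backupDuration);
                // Original attempt runs for 'finish' seconds, the backup
                // runs from launchAt until 'finish'.
                used = finish + (finish - launchAt);
            }
            jobTime = Math.max(jobTime, finish);
            slotSeconds += used;
        }
        return new long[] { jobTime, slotSeconds };
    }

    public static void main(String[] args) {
        long[] durations = { 60, 60, 90, 200 }; // seconds, one clear straggler
        long[] off = simulate(durations, false, 80, 60);
        long[] on = simulate(durations, true, 80, 60);
        System.out.println("off: time=" + off[0] + " slot-seconds=" + off[1]);
        System.out.println("on:  time=" + on[0] + " slot-seconds=" + on[1]);
    }
}
```

In this model the job finishes sooner with speculation, because the straggler's backup wins. The backup launched for the 90-second task does nothing but occupy a slot, and on a busy cluster that slot could have run another job's task.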