The default timeout is set for all queues via the defaultMinSharePreemptionTimeout top-level element in the allocation file, and on a per-queue basis by setting the minSharePreemptionTimeout element for a queue.
Likewise, if a queue remains below half of its fair share for as long as the fair share preemption timeout, then the scheduler may preempt other containers. The default timeout is set for all queues via the defaultFairSharePreemptionTimeout top-level element in the allocation file, and on a per-queue basis by setting fairSharePreemptionTimeout on a queue. The threshold may also be changed from its default of 0.5 by setting defaultFairSharePreemptionThreshold and fairSharePreemptionThreshold (per-queue).
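For example, an allocation file might set cluster-wide defaults and override them for a single queue. The element names below are the real allocation-file elements; the queue name and the values (timeouts are in seconds) are purely illustrative:

<?xml version="1.0"?>
<allocations>
  <!-- Cluster-wide preemption defaults; timeouts are in seconds -->
  <defaultMinSharePreemptionTimeout>120</defaultMinSharePreemptionTimeout>
  <defaultFairSharePreemptionTimeout>300</defaultFairSharePreemptionTimeout>
  <defaultFairSharePreemptionThreshold>0.5</defaultFairSharePreemptionThreshold>
  <!-- Overrides for a hypothetical "prod" queue -->
  <queue name="prod">
    <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
    <fairSharePreemptionTimeout>120</fairSharePreemptionTimeout>
    <fairSharePreemptionThreshold>0.8</fairSharePreemptionThreshold>
  </queue>
</allocations>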
Delay Scheduling
All the YARN schedulers try to honor locality requests. On a busy cluster, if an application requests a particular node, there is a good chance that other containers are running on it at the time of the request. The obvious course of action is to immediately loosen the locality requirement and allocate a container on the same rack. However, it has been observed in practice that waiting a short time (no more than a few seconds) can dramatically increase the chances of being allocated a container on the requested node, and therefore increase the efficiency of the cluster. This feature is called delay scheduling, and it is supported by both the Capacity Scheduler and the Fair Scheduler.
Every node manager in a YARN cluster periodically sends a heartbeat request to the resource manager (by default, one per second). Heartbeats carry information about the node manager's running containers and the resources available for new containers, so each heartbeat is a potential scheduling opportunity for an application to run a container.
When using delay scheduling, the scheduler doesn't simply use the first scheduling opportunity it receives, but waits for up to a given maximum number of scheduling opportunities to occur before loosening the locality constraint and taking the next scheduling opportunity.
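The bookkeeping behind this is simple to sketch. The following Java fragment is illustrative only; the class, method, and field names are invented and this is not YARN's actual scheduler code. It counts missed scheduling opportunities and relaxes the node constraint to rack locality once the count passes a configured maximum:

/**
 * Illustrative sketch of delay scheduling (not YARN's real code):
 * skip up to maxMissed scheduling opportunities before relaxing
 * a node-local request to rack locality.
 */
public class DelaySchedulingSketch {
    private final int maxMissed; // opportunities to skip before relaxing
    private int missed = 0;

    public DelaySchedulingSketch(int maxMissed) {
        this.maxMissed = maxMissed;
    }

    /** Called once per node manager heartbeat; true means "allocate here". */
    public boolean offer(String node, String rack,
                         String wantedNode, String wantedRack) {
        if (node.equals(wantedNode)) {
            missed = 0;
            return true;              // node-local: take it immediately
        }
        missed++;                     // a scheduling opportunity goes by
        if (missed >= maxMissed && rack.equals(wantedRack)) {
            missed = 0;
            return true;              // waited long enough: accept rack-local
        }
        return false;                 // skip and wait for a better opportunity
    }
}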
For the Capacity Scheduler, delay scheduling is configured by setting yarn.scheduler.capacity.node-locality-delay to a positive integer representing the number of scheduling opportunities that it is prepared to miss before loosening the node constraint to match any node in the same rack.
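Concretely, this is a property in capacity-scheduler.xml. The property name is real; the value of 40 is just an example (with one heartbeat per node per second, it corresponds to passing up on the order of 40 heartbeats before settling for rack locality):

<property>
  <name>yarn.scheduler.capacity.node-locality-delay</name>
  <value>40</value>
</property>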
The Fair Scheduler also uses the number of scheduling opportunities to determine the delay, although it is expressed as a proportion of the cluster size. For example, setting yarn.scheduler.fair.locality.threshold.node to 0.5 means that the scheduler should wait until half of the nodes in the cluster have presented scheduling opportunities before accepting another node in the same rack. There is a corresponding property, yarn.scheduler.fair.locality.threshold.rack, for setting the threshold before accepting another rack instead of the one requested.