erty, yarn.scheduler.fair.locality.threshold.rack , for setting the
threshold before another rack is accepted instead of the one requested.
Dominant Resource Fairness
When there is only a single resource type being scheduled, such as memory, then the concept of capacity or fairness is easy to determine. If two users are running applications, you can measure the amount of memory that each is using to compare the two applications. However, when there are multiple resource types in play, things get more complicated. If one user's application requires lots of CPU but little memory and the other's requires little CPU and lots of memory, how are these two applications compared?
The way that the schedulers in YARN address this problem is to look at each user's dominant resource and use it as a measure of the cluster usage. This approach is called Dominant Resource Fairness, or DRF for short.[43] The idea is best illustrated with a simple example.
Imagine a cluster with a total of 100 CPUs and 10 TB of memory. Application A requests containers of (2 CPUs, 300 GB), and application B requests containers of (6 CPUs, 100 GB). A's request is (2%, 3%) of the cluster, so memory is dominant since its proportion (3%) is larger than CPU's (2%). B's request is (6%, 1%), so CPU is dominant. Since B's container requests are twice as big in the dominant resource (6% versus 3%), it will be allocated half as many containers under fair sharing.
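The arithmetic above can be sketched in a few lines of Python. This is purely illustrative, not part of the YARN API; the cluster and request figures are the ones from the example (10 TB taken as 10,000 GB, matching the 3% figure in the text):

```python
# Dominant Resource Fairness: each application's share of the cluster is
# measured by whichever resource it uses the largest proportion of.
CLUSTER = {"cpu": 100, "memory_gb": 10000}  # 100 CPUs, 10 TB of memory

def dominant_share(request):
    """Return (dominant resource name, fractional share) for a container request."""
    shares = {r: request[r] / CLUSTER[r] for r in CLUSTER}
    dom = max(shares, key=shares.get)
    return dom, shares[dom]

a = {"cpu": 2, "memory_gb": 300}   # application A's container request
b = {"cpu": 6, "memory_gb": 100}   # application B's container request

dom_a, share_a = dominant_share(a)  # memory dominates: 3% > 2%
dom_b, share_b = dominant_share(b)  # CPU dominates: 6% > 1%

# B's dominant share per container is twice A's, so under fair sharing
# B is allocated half as many containers.
print(dom_a, share_a)  # memory_gb 0.03
print(dom_b, share_b)  # cpu 0.06
print(share_b / share_a)  # 2.0
```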
By default DRF is not used, so during resource calculations, only memory is considered and CPU is ignored. The Capacity Scheduler can be configured to use DRF by setting yarn.scheduler.capacity.resource-calculator to org.apache.hadoop.yarn.util.resource.DominantResourceCalculator in capacity-scheduler.xml.
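In capacity-scheduler.xml, that setting takes the standard Hadoop property form:

```xml
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
```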
For the Fair Scheduler, DRF can be enabled by setting the top-level element defaultQueueSchedulingPolicy in the allocation file to drf.
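A minimal allocation file (fair-scheduler.xml) with DRF enabled for all queues looks like this:

```xml
<?xml version="1.0"?>
<allocations>
  <defaultQueueSchedulingPolicy>drf</defaultQueueSchedulingPolicy>
</allocations>
```

Individual queues can also override this with their own schedulingPolicy element.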