Scheduling and Workflow - Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset


However, you need to modify the property mapred.jobtracker.taskScheduler in the file mapred-site.xml within the conf directory, as follows:

<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
  <description>Plugin the Fair scheduler</description>
</property>

For greater control, I add some properties to the mapred-site.xml file. I switch off preemption by setting mapred.fairscheduler.preemption to false, and I disallow undeclared pool names by setting mapred.fairscheduler.allow.undeclared.pools to false. I also set mapred.fairscheduler.poolnameproperty to mapred.job.queue.name, so that a job's queue name determines the pool it is assigned to. Finally, I use the mapred.queue.names property to define the list of queue names that may be used in the configuration, all as follows:

<property>
  <name>mapred.fairscheduler.preemption</name>
  <value>false</value>
</property>

<property>
  <name>mapred.fairscheduler.allow.undeclared.pools</name>
  <value>false</value>
</property>

<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>mapred.job.queue.name</value>
</property>

<property>
  <name>mapred.queue.names</name>
  <value>high_pool,low_pool,default</value>
</property>
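
Because mapred.fairscheduler.allow.undeclared.pools is set to false, the pools that jobs refer to have to be declared somewhere. In Hadoop 1.x the Fair scheduler reads pool declarations from an allocation file, by default fair-scheduler.xml in the conf directory, or the path given by mapred.fairscheduler.allocation.file. The following is a minimal sketch of what such a file might look like for the two named pools. It is not taken from the configuration described here, and the minMaps, minReduces, and weight values are illustrative assumptions only.

```xml
<?xml version="1.0"?>
<!-- Hypothetical allocation file (conf/fair-scheduler.xml); all numbers are placeholders -->
<allocations>
  <!-- Pool matching the high_pool queue name -->
  <pool name="high_pool">
    <minMaps>10</minMaps>        <!-- assumed guaranteed map slots -->
    <minReduces>5</minReduces>   <!-- assumed guaranteed reduce slots -->
    <weight>2.0</weight>         <!-- assumed share relative to other pools -->
  </pool>
  <!-- Pool matching the low_pool queue name -->
  <pool name="low_pool">
    <minMaps>2</minMaps>
    <minReduces>1</minReduces>
    <weight>1.0</weight>
  </pool>
</allocations>
```

With mapred.fairscheduler.poolnameproperty set to mapred.job.queue.name, a job that sets mapred.job.queue.name to high_pool would be placed in the high_pool pool declared in this sketch.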

To see the full configuration guide, go to the Apache Software Foundation website (hadoop.apache.org/docs/r1.2.1/), click Map Reduce, and then select Fair Scheduler.

As with the Capacity scheduler, you can add access control for the Fair scheduler in the mapred-queue-acls.xml file to specify user and administration access to each queue. For example, here I grant the hadoop user access to the high_pool and administration access to that queue, as follows:

<property>
  <name>mapred.queue.high_pool.acl-submit-job</name>
  <value>hadoop</value>
</property>

<property>
  <name>mapred.queue.low_pool.acl-submit-job</name>
  <value>smitha</value>
</property>
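
The entries above cover job submission only. As a hedged sketch of how administration access to a queue might be expressed, Hadoop 1.x also defines a per-queue acl-administer-jobs property in mapred-queue-acls.xml. The fragment below is an assumption about how that could look for high_pool, shown inside the configuration root element that the file requires. It is not a listing from this configuration.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumed companion to the submit ACL: users allowed to administer jobs in high_pool -->
  <property>
    <name>mapred.queue.high_pool.acl-administer-jobs</name>
    <value>hadoop</value>
  </property>
</configuration>
```

Queue ACLs of this kind are typically only checked when mapred.acls.enabled is set to true in mapred-site.xml, so it is worth confirming that setting if the entries appear to have no effect.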
