WHICH PROPERTIES CAN I SET?
ConfigurationPrinter is a useful tool for discovering what a property is set to in your environment.
For a running daemon, like the namenode, you can see its configuration by viewing the /conf page
on its web server. (See Table 10-6 to find port numbers.)
You can also see the default settings for all the public properties in Hadoop by looking in the share/doc
directory of your Hadoop installation for files called core-default.xml, hdfs-default.xml, yarn-default.xml,
and mapred-default.xml. Each property has a description that explains what it is for and what values it
can be set to.
The default settings files' documentation can be found online at pages linked from
http://hadoop.apache.org/docs/current/ (look for the “Configuration” heading in the navigation). You can
find the defaults for a particular Hadoop release by replacing current in the preceding URL with
r<version>; for example, http://hadoop.apache.org/docs/r2.5.0/.
Be aware that some properties have no effect when set in the client configuration. For example, if you set
yarn.nodemanager.resource.memory-mb in your job submission with the expectation that it
would change the amount of memory available to the node managers running your job, you would be
disappointed, because this property is honored only if set in the node manager's yarn-site.xml file. In
general, you can tell the component where a property should be set by its name, so the fact that
yarn.nodemanager.resource.memory-mb starts with yarn.nodemanager gives you a clue
that it can be set only for the node manager daemon. This is not a hard and fast rule, however, so in some
cases you may need to resort to trial and error, or even to reading the source.
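The naming clue described above can be turned into a rough lookup. What follows is a minimal sketch in plain Java, not part of Hadoop: the class name PropertyScopeHint and its prefix table are hypothetical, and the table holds only the two prefixes mentioned in this section. As noted, the prefix is a hint rather than a guarantee, so an "unknown" result (or even a match) still needs checking against the documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): guesses which daemon a property
// is meant for by looking at the prefix of its name.
public class PropertyScopeHint {
    // Only the prefixes discussed in the text; extend at your own discretion.
    private static final Map<String, String> PREFIX_TO_DAEMON = new LinkedHashMap<>();
    static {
        PREFIX_TO_DAEMON.put("yarn.nodemanager.", "node manager");
        PREFIX_TO_DAEMON.put("dfs.namenode.", "namenode");
    }

    // Returns the daemon suggested by the prefix, or "unknown" if none matches.
    public static String guessDaemon(String propertyName) {
        for (Map.Entry<String, String> entry : PREFIX_TO_DAEMON.entrySet()) {
            if (propertyName.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(guessDaemon("yarn.nodemanager.resource.memory-mb"));
        System.out.println(guessDaemon("mapreduce.job.reduces"));
    }
}
```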
Configuration property names have changed in Hadoop 2 onward, in order to give them a more regular
naming structure. For example, the HDFS properties pertaining to the namenode have been changed to
have a dfs.namenode prefix, so dfs.name.dir is now dfs.namenode.name.dir. Similarly,
MapReduce properties have the mapreduce prefix rather than the older mapred prefix, so
mapred.job.name is now mapreduce.job.name.
This topic uses the new property names to avoid deprecation warnings. The old property names still
work, however, and they are often referred to in older documentation. You can find a table listing the
deprecated property names and their replacements on the Hadoop website.
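When auditing older configuration files or scripts, it can help to translate old names to their replacements. The sketch below is an illustration only, written in plain Java with no Hadoop dependency: the class DeprecatedNames is hypothetical, and its table holds just the two renames given above rather than the full list from the Hadoop website.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup (not a Hadoop class): maps an old property name to its
// Hadoop 2 replacement, using only the two examples from the text.
public class DeprecatedNames {
    private static final Map<String, String> OLD_TO_NEW = new HashMap<>();
    static {
        OLD_TO_NEW.put("dfs.name.dir", "dfs.namenode.name.dir");
        OLD_TO_NEW.put("mapred.job.name", "mapreduce.job.name");
    }

    // Returns the replacement name if one is known; otherwise returns the
    // name unchanged, on the assumption that it is already current.
    public static String currentName(String name) {
        return OLD_TO_NEW.getOrDefault(name, name);
    }

    public static void main(String[] args) {
        System.out.println(currentName("dfs.name.dir"));
        System.out.println(currentName("mapreduce.job.name"));
    }
}
```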
We discuss many of Hadoop's most important configuration properties throughout this topic.
GenericOptionsParser also allows you to set individual properties. For example:
% hadoop ConfigurationPrinter -D color=yellow | grep color
color=yellow
Here, the -D option is used to set the configuration property with key color to the value
yellow. Options specified with -D take priority over properties from the configuration
files. This is very useful because you can put defaults into configuration files and then
override them with the -D option as needed. A common example of this is setting the
number of reducers for a MapReduce job via -D mapreduce.job.reduces=n.
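The precedence rule for -D can be modeled in a few lines. This is a simplified sketch in plain Java and is not how GenericOptionsParser is implemented: the class OverrideSketch, its effective method, and the sample file values are all hypothetical, and only the space-separated "-D key=value" form shown in the example above is handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model (not Hadoop code): values from configuration files act
// as defaults, and any "-D key=value" argument replaces the matching entry.
public class OverrideSketch {
    public static Map<String, String> effective(Map<String, String> fromFiles, String[] args) {
        // Copy the file settings so the caller's map is left untouched.
        Map<String, String> result = new LinkedHashMap<>(fromFiles);
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].equals("-D")) {
                String pair = args[i + 1];
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    // The command-line value wins over the file value.
                    result.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
                i++; // skip the token that followed -D
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> fromFiles = new LinkedHashMap<>();
        fromFiles.put("color", "red");                // assumed value, for illustration
        fromFiles.put("mapreduce.job.reduces", "1");  // assumed value, for illustration
        String[] commandLine = {"-D", "color=yellow", "-D", "mapreduce.job.reduces=4"};
        System.out.println(effective(fromFiles, commandLine));
    }
}
```

Copying the file settings before applying overrides keeps the defaults intact, which mirrors the idea that the configuration files themselves are not changed by a -D option.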