By default, this finds the namenode's hostname from fs.defaultFS. In slightly more detail, the start-dfs.sh script does the following:
▪ Starts a namenode on each machine returned by executing hdfs getconf -namenodes [71]
▪ Starts a datanode on each machine listed in the slaves file
▪ Starts a secondary namenode on each machine returned by executing hdfs getconf -secondarynamenodes
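The fs.defaultFS property that the scripts consult is set in core-site.xml. A minimal fragment might look like the following; the hostname and port here are illustrative values, not taken from the text:

```xml
<!-- core-site.xml: start-dfs.sh derives the namenode's hostname from
     this property. "namenode-host" and port 8020 are example values. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:8020/</value>
  </property>
</configuration>
```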
The YARN daemons are started in a similar way, by running the following command as the yarn user on the machine hosting the resource manager:
% start-yarn.sh
In this case, the resource manager is always run on the machine from which the start-yarn.sh script was run. More specifically, the script:
▪ Starts a resource manager on the local machine
▪ Starts a node manager on each machine listed in the slaves file
Also provided are stop-dfs.sh and stop-yarn.sh scripts to stop the daemons started by the corresponding start scripts.
These scripts start and stop Hadoop daemons using the hadoop-daemon.sh script (or the yarn-daemon.sh script, in the case of YARN). If you use the aforementioned scripts, you shouldn't call hadoop-daemon.sh directly. But if you need to control Hadoop daemons from another system or from your own scripts, the hadoop-daemon.sh script is a good integration point. Likewise, hadoop-daemons.sh (with an “s”) is handy for starting the same daemon on a set of hosts.
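The per-host pattern behind hadoop-daemons.sh can be sketched in a few lines: read a slaves file, skip blanks and comments, and run hadoop-daemon.sh on each remaining host over ssh. This is an illustrative sketch, not the actual script; the function name and the assumption that the script is on each host's PATH are mine:

```python
# Illustrative sketch of the hadoop-daemons.sh idea: build the ssh
# invocations that would start the same daemon on every host listed
# in a slaves file. Not the real script; for demonstration only.

def daemon_commands(slaves_text, daemon="datanode",
                    script="hadoop-daemon.sh"):
    """Return one ssh command (as an argv list) per listed host."""
    commands = []
    for line in slaves_text.splitlines():
        host = line.strip()
        if not host or host.startswith("#"):
            continue  # skip blank lines and comments
        commands.append(["ssh", host, script, "start", daemon])
    return commands

# Example slaves file with a comment line and two worker hosts
slaves = "# cluster workers\nworker1\nworker2\n"
for cmd in daemon_commands(slaves):
    print(" ".join(cmd))
```

In a real deployment you would hand each argv list to subprocess.run rather than printing it; building the commands separately keeps the host-list parsing easy to test.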
Finally, there is only one MapReduce daemon: the job history server, which is started as follows, as the mapred user:
% mr-jobhistory-daemon.sh start historyserver
Creating User Directories
Once you have a Hadoop cluster up and running, you need to give users access to it. This involves creating a home directory for each user and setting ownership permissions on it:
% hadoop fs -mkdir /user/username
% hadoop fs -chown username:username /user/username
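When onboarding many users, the two commands above are easy to generate in a loop. The following sketch (usernames are hypothetical) builds the command lines for each user:

```python
# Sketch: generate the mkdir/chown command pair shown above for a
# list of users. The usernames are hypothetical examples.

def user_dir_commands(users):
    """Return the 'hadoop fs' commands needed to set up each home dir."""
    cmds = []
    for user in users:
        home = f"/user/{user}"
        cmds.append(f"hadoop fs -mkdir {home}")
        cmds.append(f"hadoop fs -chown {user}:{user} {home}")
    return cmds

for cmd in user_dir_commands(["alice", "bob"]):
    print(cmd)
```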
This is a good time to set space limits on the directory. The following sets a 1 TB limit on
the given user directory: