NOTE
The file (or files) specified by the dfs.hosts and yarn.resourcemanager.nodes.include-path
properties is different from the slaves file. The former is used by the namenode and resource
manager to determine which worker nodes may connect. The slaves file is used by the Hadoop
control scripts to perform cluster-wide operations, such as cluster restarts. It is never used
by the Hadoop daemons.
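As a sketch, these properties might be wired up as follows in hdfs-site.xml and yarn-site.xml; the /etc/hadoop/conf/include path is an illustrative assumption, not a required location:

<!-- hdfs-site.xml: file listing datanodes allowed to connect to the namenode -->
<property>
  <name>dfs.hosts</name>
  <value>/etc/hadoop/conf/include</value>
</property>

<!-- yarn-site.xml: file listing node managers allowed to connect to the resource manager -->
<property>
  <name>yarn.resourcemanager.nodes.include-path</name>
  <value>/etc/hadoop/conf/include</value>
</property>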
To add new nodes to the cluster (a combined example follows the list):
1. Add the network addresses of the new nodes to the include file.
2. Update the namenode with the new set of permitted datanodes using this command:
% hdfs dfsadmin -refreshNodes
3. Update the resource manager with the new set of permitted node managers using:
% yarn rmadmin -refreshNodes
4. Update the slaves file with the new nodes, so that they are included in future operations
performed by the Hadoop control scripts.
5. Start the new datanodes and node managers.
6. Check that the new datanodes and node managers appear in the web UI.
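As a rough end-to-end sketch of steps 1 through 5, assuming the include and slaves files live under /etc/hadoop/conf (a hypothetical layout) and that the hostname node101.example.com stands in for a real new worker:

# on an admin node: add the new worker to the include file (step 1)
% echo 'node101.example.com' >> /etc/hadoop/conf/include
# refresh the namenode and resource manager (steps 2 and 3)
% hdfs dfsadmin -refreshNodes
% yarn rmadmin -refreshNodes
# add the worker to the slaves file used by the control scripts (step 4)
% echo 'node101.example.com' >> /etc/hadoop/conf/slaves
# on the new node itself: start the datanode and node manager daemons (step 5)
% hadoop-daemon.sh start datanode
% yarn-daemon.sh start nodemanager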
HDFS will not move blocks from old datanodes to new datanodes to balance the cluster.
To do this, you should run the balancer described in Balancer.
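The balancer is started from the command line with the start-balancer.sh script; the threshold shown here (the permitted deviation of a datanode's utilization from the cluster average, as a percentage) is just an example value:

% start-balancer.sh -threshold 10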
Decommissioning old nodes
Although HDFS is designed to tolerate datanode failures, this does not mean you can just
terminate datanodes en masse with no ill effect. With a replication level of three, for
example, the chances are very high that you will lose data by simultaneously shutting down
three datanodes if they are on different racks. The way to decommission datanodes is to
inform the namenode of the nodes that you wish to take out of circulation, so that it can
replicate the blocks to other datanodes before the datanodes are shut down.
With node managers, Hadoop is more forgiving. If you shut down a node manager that is
running MapReduce tasks, the application master will notice the failure and reschedule
the tasks on other nodes.
The decommissioning process is controlled by an exclude file, which is set for HDFS by the
dfs.hosts.exclude property and for YARN by the
yarn.resourcemanager.nodes.exclude-path property. It is often the case