Commands to administer HDFS
Hadoop provides several commands to administer HDFS. The following are two of the
commonly used administration commands in HDFS:
• balancer: New datanodes can be added to a cluster to provide more storage space. However, a newly added datanode does not hold any data yet, so after the addition the data blocks are no longer evenly spread across the datanodes, and the cluster is in a state of imbalance. The administrator can use the balancer command to rebalance the cluster.
The syntax of the balancer command is hdfs balancer -threshold <threshold>. Here, threshold is the balancing threshold expressed as a percentage. It is specified as a float value in the range 0 to 100, and the default value is 10. The balancer tries to move blocks to the underutilized datanodes. For example, if the average utilization of all the datanodes in the cluster is 50 percent, the balancer, by default, will try to pick up blocks from nodes that have a utilization above 60 percent (50 percent + 10 percent) and move them to nodes that have a utilization below 40 percent (50 percent - 10 percent).
• dfsadmin: The dfsadmin command is used to run administrative commands on HDFS.
The syntax of the dfsadmin command is hadoop dfsadmin <options>. In Hadoop 2 and later, hdfs dfsadmin <options> is the preferred form, and hadoop dfsadmin is deprecated.
Let's look at a few of the important command options and the actions they perform:
◦ [-report]: This generates a report of the basic filesystem information and statistics.
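◦ [-safemode <enter | leave | get | wait>]: This controls safe mode, a namenode state in which the namenode does not accept changes to the namespace (it is read-only) and does not replicate or delete blocks. The enter and leave actions switch safe mode on and off, get reports whether safe mode is on, and wait blocks until the namenode has left safe mode.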
◦ [-saveNamespace]: This saves the current state of the namespace to a storage directory and resets the edits log.
◦ [-rollEdits]: This forces a rollover of the edits log, that is, it saves the state of the current edits log and creates a fresh edits log for new transactions.
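
The threshold arithmetic of the balancer and the command lines described above can be sketched in a few lines of Python. This is only an illustration, not part of Hadoop: the function names (balancer_command, classify_datanodes, safemode_command) and the datanode names are hypothetical, and the average is taken as a simple mean of per-node utilization, which assumes datanodes of equal capacity. The sketch builds the argument lists without running them, and it uses the hdfs dfsadmin form, which newer Hadoop releases prefer over hadoop dfsadmin.

```python
from typing import Dict, List, Tuple

# Actions accepted by the -safemode option, as listed in the text.
SAFEMODE_ACTIONS = ("enter", "leave", "get", "wait")


def balancer_command(threshold: float = 10.0) -> List[str]:
    """Build the argument list for: hdfs balancer -threshold <threshold>."""
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    return ["hdfs", "balancer", "-threshold", str(threshold)]


def classify_datanodes(
    utilization: Dict[str, float], threshold: float = 10.0
) -> Tuple[List[str], List[str]]:
    """Return (over-utilized, under-utilized) datanode names.

    utilization maps a (hypothetical) datanode name to its used-space
    percentage. Assumption: all nodes have equal capacity, so the cluster
    average is the plain mean of these percentages.
    """
    average = sum(utilization.values()) / len(utilization)
    over = sorted(n for n, u in utilization.items() if u > average + threshold)
    under = sorted(n for n, u in utilization.items() if u < average - threshold)
    return over, under


def safemode_command(action: str) -> List[str]:
    """Build the argument list for: hdfs dfsadmin -safemode <action>."""
    if action not in SAFEMODE_ACTIONS:
        raise ValueError(f"action must be one of {SAFEMODE_ACTIONS}")
    return ["hdfs", "dfsadmin", "-safemode", action]


if __name__ == "__main__":
    # Average utilization is 50 percent, so with the default threshold of 10
    # the bounds are 60 percent and 40 percent, as in the example in the text.
    nodes = {"dn1": 75.0, "dn2": 50.0, "dn3": 50.0, "dn4": 25.0}
    over, under = classify_datanodes(nodes)
    print("move blocks from:", over)
    print("move blocks to:", under)
    print(" ".join(balancer_command()))
    print(" ".join(safemode_command("get")))
```

On a machine with a configured Hadoop client, the returned lists could be passed to subprocess.run to execute the commands. Keeping the command construction separate from execution makes it easy to check the arguments before touching a live cluster.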