In addition to the memory requirements of the daemons, the node manager allocates containers to applications, so we need to factor these into the total memory footprint of a worker machine; see Memory settings in YARN and MapReduce.
System logfiles
System logfiles produced by Hadoop are stored in $HADOOP_HOME/logs by default. This can be changed using the HADOOP_LOG_DIR setting in hadoop-env.sh. It's a good idea to change this so that logfiles are kept out of the Hadoop installation directory; that way they stay in one place even when the installation directory changes after an upgrade. A common choice is /var/log/hadoop, set by including the following line in hadoop-env.sh:
export HADOOP_LOG_DIR=/var/log/hadoop
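If you point HADOOP_LOG_DIR at a system location like this, an unprivileged Hadoop user usually can't create the directory under /var/log, so it is common to pre-create it with appropriate ownership. A minimal sketch, assuming the daemons run as a hypothetical hadoop user and hadoop group:
# Assumption: the daemons run as user "hadoop" in group "hadoop"; adjust to your setup.
sudo mkdir -p /var/log/hadoop
sudo chown hadoop:hadoop /var/log/hadoop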
The log directory will be created if it doesn't already exist. (If it does not exist, confirm that the Unix user running the relevant Hadoop daemon has permission to create it.) Each Hadoop daemon running on a machine produces two logfiles. The first is the log output written via log4j. This file, whose name ends in .log, should be the first port of call when diagnosing problems because most application log messages are written here. The standard Hadoop log4j configuration uses a daily rolling file appender to rotate logfiles. Old logfiles are never deleted, so you should arrange for them to be periodically deleted or archived, so as not to run out of disk space on the local node.
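One common way to arrange this is a periodic job that removes rotated log4j files past a retention window. A minimal sketch, assuming the /var/log/hadoop location above and a hypothetical 30-day retention, run from cron:
# Hypothetical cron entry (e.g. in /etc/cron.d/hadoop-logs): delete rotated
# .log files older than 30 days; adjust the path and retention to your needs.
30 3 * * * root find /var/log/hadoop -name '*.log.*' -mtime +30 -delete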
The second logfile is the combined standard output and standard error log. This logfile, whose name ends in .out, usually contains little or no output, since Hadoop uses log4j for logging. It is rotated only when the daemon is restarted, and only the last five logs are retained. Old logfiles are suffixed with a number between 1 and 5, with 5 being the oldest file.
Logfile names (of both types) are a combination of the name of the user running the daemon, the daemon name, and the machine hostname. For example, hadoop-hdfs-datanode-ip-10-45-174-112.log.2014-09-20 is the name of a logfile after it has been rotated. This naming structure makes it possible to archive logs from all machines in the cluster in a single directory, if needed, since the filenames are unique.
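Because the filenames already encode the user, daemon name, and hostname, a simple pull-based collection into one directory works without collisions. A minimal sketch, assuming a hypothetical workers file listing hostnames, passwordless SSH, and the /var/log/hadoop location above:
# Hypothetical collection script: pull each worker's logs into a single
# archive directory; names are unique per machine, so nothing collides.
while read host; do
  rsync -az "$host":/var/log/hadoop/ /archive/hadoop-logs/
done < workers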
The username in the logfile name is actually the default for the HADOOP_IDENT_STRING setting in hadoop-env.sh. If you wish to give the Hadoop instance a different identity for the purposes of naming the logfiles, change HADOOP_IDENT_STRING to be the identifier you want.
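For example, to name the logfiles after a shared identifier (a hypothetical hadoop value here) rather than the Unix username of whoever starts the daemons, you could add the following to hadoop-env.sh:
export HADOOP_IDENT_STRING=hadoop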