So, you have the /tmp and /user directories, and the /var/log directory exists with a subdirectory for YARN. Now,
you need to create home directories for the Map Reduce users on HDFS. In this case there is only the hadoop account,
so you change user (su) to the Linux hdfs account, then use the hadoop file system command mkdir to create the
directory and use chown to set its ownership to hadoop:
[root@hc1nn sysconfig]# su - hdfs
-bash-4.1$
-bash-4.1$ hadoop fs -mkdir /user/hadoop
-bash-4.1$ hadoop fs -chown hadoop /user/hadoop
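If more Map Reduce accounts are added later, the same two commands repeat for each one. The following is a minimal dry-run sketch of that pattern; the function name hdfs_home_cmds and the user list are hypothetical, and it only prints the commands so that they can be reviewed before being run from the hdfs account.

```shell
# Dry-run sketch: print the HDFS commands for each user's home directory.
# hdfs_home_cmds is a hypothetical helper name used for illustration only.
hdfs_home_cmds() {
  user="$1"
  echo "hadoop fs -mkdir /user/${user}"
  echo "hadoop fs -chown ${user} /user/${user}"
}

# Only the hadoop account exists in this installation; add names as needed.
for u in hadoop; do
  hdfs_home_cmds "$u"
done
```

Printing rather than executing keeps the sketch safe to try on a machine without Hadoop; the printed lines match the commands used above.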
The last step is to set up the Map Reduce user environment by setting some environment variables in the Bash
configuration file .bashrc. As in the Hadoop V1 installation, this defines variables such as JAVA_HOME and
HADOOP_MAPRED_HOME, so that they are already set each time the Linux account is accessed and a Bash shell
is created.
[hadoop@hc1nn ~]$ tail .bashrc
#######################################################
# Set Hadoop related env variables
# set JAVA_HOME (we will also set a hadoop specific value later)
export JAVA_HOME=/usr/lib/jvm/jre-1.6.0-openjdk
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
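A quick way to confirm that these settings took effect in a new shell is to check that each variable is set and points to an existing directory. This is a sketch under that assumption; check_env_dir is a hypothetical helper, not part of Hadoop.

```shell
# check_env_dir NAME: succeed only if the variable NAME is set and names
# an existing directory. Hypothetical helper for illustration.
check_env_dir() {
  name="$1"
  eval "value=\${$name:-}"   # look up the variable whose name is in $name
  if [ -z "$value" ]; then
    echo "$name is not set"
    return 1
  fi
  if [ ! -d "$value" ]; then
    echo "$name points to a missing directory: $value"
    return 1
  fi
  echo "$name=$value"
}

# Check the two variables exported in .bashrc.
for var in JAVA_HOME HADOOP_MAPRED_HOME; do
  check_env_dir "$var" || echo "fix $var in .bashrc"
done
```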
At this point you have completed the configuration of your installation and you are ready to start the servers.
Remember to monitor the logs under /var/log for server errors; when the servers start, they state the location where
they are logging to.
You start the HDFS servers and monitor the logs for errors:
[root@hc1nn init.d]# cd /etc/init.d
[root@hc1nn init.d]# ls -ld hadoop-hdfs-*
-rwxr-xr-x. 1 root root 4469 Feb 26 23:18 hadoop-hdfs-namenode
[root@hc1nn init.d]# service hadoop-hdfs-namenode start
Starting Hadoop namenode: [ OK ]
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-hc1nn.out
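Monitoring a log for errors can be as simple as counting suspicious lines. The sketch below assumes that problems are reported on lines containing the words ERROR or FATAL; count_log_errors is a hypothetical helper, and the path is the one reported by the name node at startup above.

```shell
# count_log_errors FILE: print how many lines mention ERROR or FATAL.
# Assumption: server problems appear on lines containing those words.
count_log_errors() {
  grep -c -E 'ERROR|FATAL' "$1" || true   # grep -c exits 1 on a zero count
}

# Path taken from the name node startup message; adjust for other servers.
LOG=/var/log/hadoop-hdfs/hadoop-hdfs-namenode-hc1nn.out
if [ -r "$LOG" ]; then
  echo "$(count_log_errors "$LOG") suspicious lines in $LOG"
fi
```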
If you start the name node on the master and the data nodes on the slaves, check that the necessary ports
are open in the firewall. For example, I check that port 8020 is open in the firewall configuration. I also show how
the Iptables (Linux kernel firewall) service can be restarted. (I do not provide an in-depth study of the firewall
configuration, as that leads us into the realm of systems administration, which is a separate field.)
[root@hc1nn sysconfig]# cd /etc/sysconfig
[root@hc1nn sysconfig]# grep 8020 iptables
-A INPUT -m state --state NEW -m tcp -p tcp --dport 8020 -j ACCEPT
This rule tells the firewall to accept new TCP connections on port 8020.
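The same grep check can be wrapped in a small function so that several ports can be tested in turn. This is only a sketch: port_accepted is a hypothetical helper, and it assumes that rules are written in the same form as the line found by grep above.

```shell
# port_accepted PORT FILE: succeed if FILE contains an INPUT rule that
# accepts traffic to PORT. Hypothetical helper; assumes the rule layout
# matches the grep output shown above.
port_accepted() {
  port="$1"
  rules="$2"
  grep -q -e "^-A INPUT .*--dport ${port} -j ACCEPT" "$rules"
}

RULES=/etc/sysconfig/iptables
if [ -r "$RULES" ]; then   # the file is normally readable by root only
  if port_accepted 8020 "$RULES"; then
    echo "port 8020 is open"
  else
    echo "port 8020 is not open"
  fi
fi
```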
If the ports are not open, then Iptables needs to be updated with an entry similar to the ACCEPT rule in the grep
output above and the service restarted, as shown next. (If you are unsure about this, consult your systems administrator.)