[hadoop@hc1r1m2 ~]$ exit
logout
Connection to hc1r1m2 closed.
[hadoop@hc1nn ~]$ ssh hadoop@hc1r1m1
Last login: Thu Mar 13 19:40:22 2014 from hc1r1m3
[hadoop@hc1r1m1 ~]$ java -version
java version "1.6.0_30"
OpenJDK Runtime Environment (IcedTea6 1.13.1) (rhel-3.1.13.1.el6_5-x86_64)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)
[hadoop@hc1r1m1 ~]$ exit
logout
Connection to hc1r1m1 closed.
These SSH sessions show that a secure shell session can be created from the name node, hc1nn, to each
of the data nodes.
Notice that I am using the Java OpenJDK ( http://openjdk.java.net/ ) here. Generally it's advised that you use
the Oracle Sun JDK; however, Hadoop has been tested against the OpenJDK, and I am familiar with its use. I don't
need to register to use OpenJDK, and I can install it on CentOS using a simple yum command. Additionally, the Sun
JDK install is more complicated.
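As an illustration, on CentOS 6 the OpenJDK 1.6 packages can be installed with yum as root; the package names below are an assumption based on the standard CentOS repositories, so confirm them with yum search openjdk first:

[root@hc1nn ~]# yum install java-1.6.0-openjdk java-1.6.0-openjdk-devel
[root@hc1nn ~]# java -version

The second command simply confirms the installed version, as in the session output shown earlier.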
Now let's download and install a version of Hadoop V1. In order to find the release of Apache Hadoop to
download, start here: http://hadoop.apache.org .
Next, choose Download Hadoop, click the release option, then choose Download, followed by Download a
Release Now! This will bring you to this page: http://www.apache.org/dyn/closer.cgi/hadoop/common/ . It suggests
a local mirror site that you can use to download the software. It's a confusing path to follow; I'm sure that this website
could be simplified a little. The suggested link for me is http://apache.insync.za.net/hadoop/common . You may be
offered a different link.
On selecting that site, I'm offered a series of releases. I choose 1.2.1, and then I download the file
hadoop-1.2.1.tar.gz. Why choose this particular format over the others? From past experience, I know how to unpack it and use
it; feel free to choose the format with which you're most comfortable.
Download the file to /home/hadoop/Downloads. (This download and installation must be carried out on each
server.) You are now ready to begin the Hadoop single-node installation for Hadoop 1.2.1.
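If you prefer to fetch the file from the command line, wget will also do the job; the URL below points at the Apache archive and is only an example, as the path on your chosen mirror may differ:

[hadoop@hc1nn ~]$ cd $HOME/Downloads
[hadoop@hc1nn Downloads]$ wget http://archive.apache.org/dist/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz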
The approach from this point on will be to install Hadoop onto each server separately as a single-node installation,
configure it, and try to start the servers. This will prove that each node is correctly configured individually. After that,
the nodes will be grouped into a Hadoop master/slave cluster. The next section describes the single-node installation
and test, which should be carried out on all nodes. This will involve unpacking the software, configuring the
environment files, formatting the file system, and starting the servers. This is a manual process; if you have a very large
production cluster, you would need to devise a method of automating the process.
Hadoop 1.2.1 Single-Node Installation
From this point on, you will be carrying out a single-node Hadoop installation (until you format the Hadoop file
system on this node). First, you ftp the file hadoop-1.2.1.tar.gz to all of your nodes and carry out the steps in this
section on all nodes.
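Because passwordless SSH is already set up between the nodes, scp can be used in place of ftp to push the archive out. This is only a sketch, assuming the data node names used earlier and the Downloads directory under the hadoop user's home:

[hadoop@hc1nn ~]$ for node in hc1r1m1 hc1r1m2 hc1r1m3
> do
>   scp $HOME/Downloads/hadoop-1.2.1.tar.gz hadoop@${node}:Downloads/
> done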
So, given that you are logged in as the user hadoop, you see the following file in the $HOME/Downloads
directory:
[hadoop@hc1nn Downloads]$ ls -l
total 62356
-rw-rw-r--. 1 hadoop hadoop 63851630 Mar 15 15:01 hadoop-1.2.1.tar.gz
 