[hadoop@hc1r1m2 ~]$ exit
logout
Connection to hc1r1m2 closed.
[hadoop@hc1nn ~]$ ssh hadoop@hc1r1m1
Last login: Thu Mar 13 19:40:22 2014 from hc1r1m3
[hadoop@hc1r1m1 ~]$ java -version
java version "1.6.0_30"
OpenJDK Runtime Environment (IcedTea6 1.13.1) (rhel-3.1.13.1.el6_5-x86_64)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)
[hadoop@hc1r1m1 ~]$ exit
logout
Connection to hc1r1m1 closed.
These SSH sessions show that a secure shell session can be created from the name node, hc1nn, to each
of the data nodes.
Notice that I am using the Java OpenJDK ( http://openjdk.java.net/ ) here. Generally it's advised that you use
the Oracle Sun JDK; however, Hadoop has been tested against the OpenJDK, and I am familiar with its use. I don't
need to register to use OpenJDK, and I can install it on CentOS using a simple yum command. Additionally, the Sun
JDK install is more complicated.
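As an illustration, on CentOS 6 the OpenJDK 1.6 packages can be installed with yum as root; the package names below are an assumption based on the standard CentOS repositories, so confirm them with yum search openjdk first:

[root@hc1nn ~]# yum install java-1.6.0-openjdk java-1.6.0-openjdk-devel
[root@hc1nn ~]# java -version

The second command simply confirms the installed version, as in the session output shown earlier.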
Now let's download and install a version of Hadoop V1. In order to find the release of Apache Hadoop to
download, start here: http://hadoop.apache.org .
Next, choose Download Hadoop, click the release option, then choose Download, followed by Download a
Release Now! This will bring you to this page: http://www.apache.org/dyn/closer.cgi/hadoop/common/ . It suggests
a local mirror site that you can use to download the software. It's a confusing path to follow; I'm sure that this website
could be simplified a little. The suggested link for me is http://apache.insync.za.net/hadoop/common . You may be
offered a different link.
On selecting that site, I'm offered a series of releases. I choose 1.2.1, and then I download the file
hadoop-1.2.1.tar.gz. Why choose this particular format over the others? From past experience, I know how to unpack it and use
it; feel free to choose the format with which you're most comfortable.
Download the file to /home/hadoop/Downloads. (This download and installation must be carried out on each
server.) You are now ready to begin the Hadoop single-node installation for Hadoop 1.2.1.
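If you prefer to fetch the file from the command line, wget will also do the job; the URL below points at the Apache archive and is only an example, as the path on your chosen mirror may differ:

[hadoop@hc1nn ~]$ cd $HOME/Downloads
[hadoop@hc1nn Downloads]$ wget http://archive.apache.org/dist/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz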
The approach from this point on will be to install Hadoop onto each server separately as a single-node installation,
configure it, and try to start the servers. This will prove that each node is correctly configured individually. After that,
the nodes will be grouped into a Hadoop master/slave cluster. The next section describes the single-node installation
and test, which should be carried out on all nodes. This will involve unpacking the software, configuring the
environment files, formatting the file system, and starting the servers. This is a manual process; if you have a very large
production cluster, you would need to devise a method of automating the process.
Hadoop 1.2.1 Single-Node Installation
From this point on, you will be carrying out a single-node Hadoop installation (until you format the Hadoop file
system on this node). First, you ftp the file hadoop-1.2.1.tar.gz to all of your nodes and carry out the steps in this
section on all nodes.
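Because passwordless SSH is already set up between the nodes, scp can be used in place of ftp to push the archive out. This is only a sketch, assuming the data node names used earlier and the Downloads directory under the hadoop user's home:

[hadoop@hc1nn ~]$ for node in hc1r1m1 hc1r1m2 hc1r1m3
> do
>   scp $HOME/Downloads/hadoop-1.2.1.tar.gz hadoop@${node}:Downloads/
> done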
So, given that you are logged in as the user hadoop, you see the following file in the $HOME/Downloads
directory:
[hadoop@hc1nn Downloads]$ ls -l
total 62356
-rw-rw-r--. 1 hadoop hadoop 63851630 Mar 15 15:01 hadoop-1.2.1.tar.gz
 