hc1r1m1: starting datanode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-hc1r1m1.out
hc1r1m3: starting datanode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-hc1r1m3.out
hc1nn: starting datanode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-hc1nn.out
hc1nn: starting secondarynamenode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-hc1nn.out
As mentioned, check the logs for errors under $HADOOP_PREFIX/logs on each server. Errors like
"No Route to Host" are a good indication that your firewall is blocking a port. You will save a great deal of time
and effort by ensuring that the necessary firewall ports are open. (If you are unsure how to do this, approach your
systems administrator.)
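As a minimal sketch of what opening a port might look like, assuming a CentOS-style iptables firewall and two commonly used Hadoop V1 port numbers (check core-site.xml, hdfs-site.xml, and mapred-site.xml for the ports your cluster actually uses), run the following as root on each node:

iptables -I INPUT -p tcp --dport 9000 -j ACCEPT    # HDFS name node IPC (fs.default.name)
iptables -I INPUT -p tcp --dport 50010 -j ACCEPT   # data node block transfer
service iptables save                              # persist the rules across reboots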
You can now check that the servers are running on the name node by using the jps command:
[hadoop@hc1nn ~]$ jps
2116 SecondaryNameNode
2541 Jps
1998 DataNode
1878 NameNode
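As a further check, you can ask the name node for a cluster report:

[hadoop@hc1nn ~]$ hadoop dfsadmin -report

This prints the configured and remaining HDFS capacity, followed by a status block for each live data node, so any data node that failed to start will be conspicuously absent from the list.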
If you need to stop the HDFS servers, you can use the stop-dfs.sh script. Don't do it yet, however, as you will
start the Map Reduce servers next.
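For later reference, the shutdown sequence is the reverse of start-up:

stop-mapred.sh
stop-dfs.sh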
With the HDFS servers running, it is now time to start the Map Reduce servers. The HDFS servers should always
be started first and stopped last. Use the start-mapred.sh script to start the Map Reduce servers, as follows:
[hadoop@hc1nn logs]$ start-mapred.sh
starting jobtracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-hc1nn.out
hc1r1m2: starting tasktracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-hc1r1m2.out
hc1r1m3: starting tasktracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-hc1r1m3.out
hc1r1m1: starting tasktracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-hc1r1m1.out
hc1nn: starting tasktracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-hc1nn.out
Note that the Job Tracker has been started on the name node, and a Task Tracker has been started on each node in the
cluster, including hc1nn, which doubles as a data node in this small installation. Again, check all of the logs for errors.
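If you rerun jps on the name node at this point, the list should now include JobTracker and TaskTracker entries alongside the NameNode, SecondaryNameNode, and DataNode processes seen earlier (the process ID numbers will, of course, differ):

[hadoop@hc1nn ~]$ jps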
Running a Map Reduce Job Check
When your Hadoop V1 system has all servers up and there are no errors in the logs, you're ready to run a sample Map
Reduce job to confirm that tasks execute correctly. For example, try using some text based on the works of Edgar
Allan Poe. I have downloaded this data from the Internet and stored it on the Linux file system under /tmp/edgar. You
could use any text-based data, however, as the aim is simply to count some words using Map Reduce; it is not the
data that is important but, rather, the correct functioning of Hadoop. To begin, go to the edgar directory, as follows:
cd /tmp/edgar
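The steps that follow from here load the text into HDFS and run the word-count example. As a minimal sketch of those steps, assuming the examples JAR shipped with Hadoop 1.2.1 and illustrative HDFS paths (/user/hadoop/edgar and /user/hadoop/edgar-results are hypothetical; choose your own):

hadoop dfs -mkdir /user/hadoop/edgar                   # create an HDFS input directory
hadoop dfs -copyFromLocal /tmp/edgar/* /user/hadoop/edgar
hadoop jar /usr/local/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount \
    /user/hadoop/edgar /user/hadoop/edgar-results
hadoop dfs -cat /user/hadoop/edgar-results/part-r-00000 | head   # inspect the word counts

Note that the output directory must not already exist when the job is submitted, or the job will fail.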