hc1r1m1: starting datanode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-hc1r1m1.out
hc1r1m3: starting datanode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-hc1r1m3.out
hc1nn: starting datanode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-hc1nn.out
hc1nn: starting secondarynamenode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-hc1nn.out
As mentioned, check the logs for errors under $HADOOP_PREFIX/logs on each server. Errors like
"No Route to Host" are a good indication that your firewall is blocking a port. You will save a great deal of time
and effort by ensuring that the necessary firewall ports are open. (If you are unsure how to do this, approach your
systems administrator.)
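As a minimal sketch of what opening a port might look like, assuming a CentOS-style iptables firewall and two commonly used Hadoop V1 port numbers (check core-site.xml, hdfs-site.xml, and mapred-site.xml for the ports your cluster actually uses), run the following as root on each node:

iptables -I INPUT -p tcp --dport 9000 -j ACCEPT    # HDFS name node IPC (fs.default.name)
iptables -I INPUT -p tcp --dport 50010 -j ACCEPT   # data node block transfer
service iptables save                              # persist the rules across reboots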
You can now check that the servers are running on the name node by using the jps command:
[hadoop@hc1nn ~]$ jps
2116 SecondaryNameNode
2541 Jps
1998 DataNode
1878 NameNode
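As a further check, you can ask the name node for a cluster report:

[hadoop@hc1nn ~]$ hadoop dfsadmin -report

This prints the configured and remaining HDFS capacity, followed by a status block for each live data node, so any data node that failed to start will be conspicuously absent from the list.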
If you need to stop the HDFS servers, you can use the stop-dfs.sh script. Don't do it yet, however, as you will
start the Map Reduce servers next.
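For later reference, the shutdown sequence is the reverse of start-up:

stop-mapred.sh
stop-dfs.sh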
With the HDFS servers running, it is now time to start the Map Reduce servers. The HDFS servers should always
be started first and stopped last. Use the start-mapred.sh script to start the Map Reduce servers, as follows:
[hadoop@hc1nn logs]$ start-mapred.sh
starting jobtracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-hc1nn.out
hc1r1m2: starting tasktracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-hc1r1m2.out
hc1r1m3: starting tasktracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-hc1r1m3.out
hc1r1m1: starting tasktracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-hc1r1m1.out
hc1nn: starting tasktracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-hc1nn.out
Note that the Job Tracker has been started on the name node, and a Task Tracker has been started on each node in the
cluster, including hc1nn, which doubles as a data node in this small installation. Again, check all of the logs for errors.
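If you rerun jps on the name node at this point, the list should now include JobTracker and TaskTracker entries alongside the NameNode, SecondaryNameNode, and DataNode processes seen earlier (the process ID numbers will, of course, differ):

[hadoop@hc1nn ~]$ jps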
Running a Map Reduce Job Check
When your Hadoop V1 system has all servers up and there are no errors in the logs, you're ready to run a sample Map
Reduce job to confirm that tasks execute correctly. For example, try using some text based on the works of Edgar
Allan Poe. I have downloaded this data from the Internet and stored it on the Linux file system under /tmp/edgar. You
could use any text-based data, however, as the aim is simply to count some words using Map Reduce; it is not the
data that is important but, rather, the correct functioning of Hadoop. To begin, go to the edgar directory, as follows:
cd /tmp/edgar
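The steps that follow from here load the text into HDFS and run the word-count example. As a minimal sketch of those steps, assuming the examples JAR shipped with Hadoop 1.2.1 and illustrative HDFS paths (/user/hadoop/edgar and /user/hadoop/edgar-results are hypothetical; choose your own):

hadoop dfs -mkdir /user/hadoop/edgar                   # create an HDFS input directory
hadoop dfs -copyFromLocal /tmp/edgar/* /user/hadoop/edgar
hadoop jar /usr/local/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount \
    /user/hadoop/edgar /user/hadoop/edgar-results
hadoop dfs -cat /user/hadoop/edgar-results/part-r-00000 | head   # inspect the word counts

Note that the output directory must not already exist when the job is submitted, or the job will fail.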