<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>The actual number of replications can be specified when the
    file is created.</description>
  </property>
</configuration>
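As the description notes, this value is only a cluster-wide default; replication can still be set per file. A minimal sketch, with a purely hypothetical local file and HDFS path, showing a replication factor given at creation time and one changed afterward (on a one-node cluster the extra replica can't actually be placed, so the block simply stays under-replicated):
[hadoop-user@master]$ bin/hadoop fs -D dfs.replication=2 -put localfile /user/hadoop-user/somefile
[hadoop-user@master]$ bin/hadoop fs -setrep 2 /user/hadoop-user/somefile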
In core-site.xml and mapred-site.xml we specify the hostname and port of the NameNode and the JobTracker, respectively (a sketch of these entries follows the listing below). In hdfs-site.xml we specify the default replication factor for HDFS, which we set to 1 because we're running on only one node. We must also specify the location of the Secondary NameNode in the masters file and the slave nodes in the slaves file:
[hadoop-user@master]$ cat masters
localhost
[hadoop-user@master]$ cat slaves
localhost
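For reference, here is a sketch of the corresponding single-node entries in core-site.xml and mapred-site.xml; the localhost ports shown (9000 and 9001) are assumptions, not values mandated by Hadoop.
In core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
In mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>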
Although all the daemons run on the same machine, they still communicate with each other using the same SSH protocol as if they were distributed over a cluster. Section 2.2 has a more detailed discussion of setting up the SSH channels, but for single-node operation simply check whether your machine already allows you to ssh back to itself:
[hadoop-user@master]$ ssh localhost
If it does, you're good. Otherwise, setting it up takes two commands:
[hadoop-user@master]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
[hadoop-user@master]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
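To confirm that the key-based login is truly passwordless, you can force a non-interactive connection; BatchMode disables password prompts, so the command below should print the hostname without asking for anything:
[hadoop-user@master]$ ssh -o BatchMode=yes localhost hostname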
You are almost ready to start Hadoop. But first you need to format your HDFS using the command:
[hadoop-user@master]$ bin/hadoop namenode -format
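Formatting creates an empty HDFS directory structure on the local disk. Unless you've overridden hadoop.tmp.dir or dfs.name.dir, it ends up under /tmp; a quick sanity check (the exact path depends on your configuration and username, so the one below is only a guess for a user named hadoop-user):
[hadoop-user@master]$ ls /tmp/hadoop-hadoop-user/dfs/name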
We can now launch the daemons using the start-all.sh script. The Java jps command lists the running daemons, letting us verify that the setup was successful.
[hadoop-user@master]$ bin/start-all.sh
[hadoop-user@master]$ jps
26893 Jps
26832 TaskTracker
26620 SecondaryNameNode
26333 NameNode
26484 DataNode
26703 JobTracker
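If any of the daemons is missing from the jps output, the log files under the logs/ directory of your Hadoop installation are the first place to look. When you're done, the companion script shuts all the daemons down:
[hadoop-user@master]$ bin/stop-all.sh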
 