<property>
<name>mapred.job.tracker</name>
<!-- Locate the JobTracker on the host named "master" -->
<value>master:9001</value>
<description>The host and port that the MapReduce job tracker runs
at.</description>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<!-- Increase the HDFS replication factor -->
<value>3</value>
<description>The actual number of replications can be specified when the
file is created.</description>
</property>
</configuration>
The key differences are:
We explicitly stated the hostnames for the locations of the NameNode and
JobTracker daemons.
We increased the HDFS replication factor to take advantage of distributed
storage. Recall that data is replicated across HDFS to increase availability and
reliability.
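As the dfs.replication description notes, the value in hdfs-site.xml is only a cluster-wide default; individual files can use a different factor. A quick sketch (the file and directory names here are hypothetical, and a running cluster is assumed):

```shell
# Override the default replication factor for one file at write time
# (-D passes a generic configuration option to the fs shell):
bin/hadoop fs -D dfs.replication=2 -put bigfile.dat /data/bigfile.dat

# Or change it for a file that already exists; -w waits for the
# re-replication to complete before returning:
bin/hadoop fs -setrep -w 2 /data/bigfile.dat
```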
We also need to update the masters and slaves files
to reflect the locations of the other
daemons.
[hadoop-user@master]$ cat masters
backup
[hadoop-user@master]$ cat slaves
hadoop1
hadoop2
hadoop3
...
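Distributing the configuration by hand quickly becomes tedious, so a loop over the slaves file is a common shortcut. This is only a sketch: the paths are assumptions for a typical layout, and the leading echo makes it a dry run; remove the echo to perform the actual copy (the master and backup hosts need the same files too).

```shell
# Dry run: print one rsync command per node listed in the slaves file.
# Remove "echo" to actually push the conf/ directory to each worker.
for host in $(cat conf/slaves 2>/dev/null); do
  echo rsync -az conf/ "hadoop-user@${host}:hadoop/conf/"
done
```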
Once you have copied these files across all the nodes in your cluster, be sure to format
HDFS to prepare it for storage:
[hadoop-user@master]$ bin/hadoop namenode -format
Now you can start the Hadoop daemons:
[hadoop-user@master]$ bin/start-all.sh
and verify that each node is running its assigned daemons:
[hadoop-user@master]$ jps
30879 JobTracker
30717 NameNode
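The slave nodes can be spot-checked the same way. A sketch, assuming passwordless SSH to each worker is already set up (start-all.sh requires it anyway); each slave should report a DataNode for HDFS and a TaskTracker for MapReduce:

```shell
# Run jps remotely on every node listed in the slaves file.
for host in $(cat conf/slaves 2>/dev/null); do
  echo "--- ${host} ---"
  ssh "$host" jps
done
```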