[hadoop@hc1nn ~]$ pwd
/home/hadoop
[hadoop@hc1nn ~]$ ls -l .bashrc*
lrwxrwxrwx. 1 hadoop hadoop 16 Jun 30 17:59 .bashrc -> .bashrc_hadoopv2
-rw-r--r--. 1 hadoop hadoop 1586 Jun 18 17:08 .bashrc_hadoopv1
-rw-r--r--. 1 hadoop hadoop 1588 Jul 27 11:33 .bashrc_hadoopv2
The Linux pwd command shows that the current location is the Linux hadoop user's home directory, /home/hadoop. The Linux ls command produces a long listing that shows a symbolic link called .bashrc, which points to either the Hadoop V1 or the V2 version of the Bash configuration file. Currently it points to V2, so you need to change it back to V1. (I will not explain the contents of these files, as they are listed in Chapter 2.)
Delete the symbolic link named .bashrc by using the Linux rm command; then re-create it to point to the V1 file by using the Linux ln command with the -s (symbolic) switch:
[hadoop@hc1nn ~]$ rm .bashrc
[hadoop@hc1nn ~]$ ln -s .bashrc_hadoopv1 .bashrc
[hadoop@hc1nn ~]$ ls -l .bashrc*
lrwxrwxrwx 1 hadoop hadoop 16 Nov 12 18:32 .bashrc -> .bashrc_hadoopv1
-rw-r--r--. 1 hadoop hadoop 1586 Jun 18 17:08 .bashrc_hadoopv1
-rw-r--r--. 1 hadoop hadoop 1588 Jul 27 11:33 .bashrc_hadoopv2
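As a quick sanity check (this command is not in the original listing), the Linux readlink command prints the link target directly; given the ln command above, it should report the V1 file:
[hadoop@hc1nn ~]$ readlink .bashrc
.bashrc_hadoopv1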
That re-creates the correct environment configuration file for the Linux hadoop account, but how does it take effect? Either log out using the exit command and log back in, or source the file as follows:
[hadoop@hc1nn ~]$ . ./.bashrc
The leading "." is the Bash source command: it executes .bashrc in the current shell rather than in a subshell, so the variable settings persist. The ./ specifies that the .bashrc file is read from the current directory. Now you are ready to start the Hadoop V1 servers.
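To confirm that the V1 environment is now active (assuming, as in Chapter 2, that .bashrc_hadoopv1 sets the HADOOP_PREFIX variable to the V1 installation directory), echo the variable:
[hadoop@hc1nn ~]$ echo $HADOOP_PREFIX
/usr/local/hadoop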
Starting the Servers
The Hadoop V1 environment has been configured, and the V2 Hadoop servers have already been stopped. Now, you
change to the proper directory and start the servers:
[hadoop@hc1nn ~]$ cd $HADOOP_PREFIX/bin
[hadoop@hc1nn bin]$ pwd
/usr/local/hadoop/bin
[hadoop@hc1nn bin]$ ./start-dfs.sh
[hadoop@hc1nn bin]$ ./start-mapred.sh
These commands change to the /usr/local/hadoop/bin directory using the HADOOP_PREFIX variable. The HDFS servers are started using the start-dfs.sh script, followed by the Map Reduce servers with start-mapred.sh. At this point, you can begin the Nutch work, using Hadoop V1 on this cluster.
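As a quick check (not part of the original steps), the JDK's jps command lists the running Java daemon processes; assuming hc1nn is the name node, you would expect to see NameNode, SecondaryNameNode, and JobTracker, plus DataNode and TaskTracker if this host also acts as a worker:
[hadoop@hc1nn bin]$ jps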
Architecture 1: Nutch 1.x
This first example illustrates how Nutch, Solr, and Hadoop work together. You will learn how to download, install, and
configure Nutch 1.8 and Solr, as well as how to set up your environment and build Nutch. With the prep work finished,
I'll walk you through running a sample Nutch crawl using Solr and then storing the data on the Hadoop file system.