Update it to the following to provide filtering functionality for the parsed tokens:
<filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt"
language="English"/>
Add the following lines at the end of the <fields> section to define some extra fields for the documents to be
parsed. (Making these changes avoids Nutch document parsing errors.)
<!-- fields for Nutch -->
<field name="_version_" type="long" indexed="true" stored="true"/>
<field name="text" type="string" indexed="true" stored="true"/>
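Before rebuilding, it can be worth confirming that both field definitions actually made it into the schema file. The following is a minimal sketch only: check_nutch_fields is a hypothetical helper, not a Nutch or Solr command, and the path to the schema file is left as an argument because it depends on your installation.

```shell
# Hypothetical helper: confirm the two Nutch fields exist in a schema file.
# Usage: check_nutch_fields /path/to/schema.xml   (the path is an assumption)
check_nutch_fields() {
    schema="$1"
    for name in _version_ text; do
        # Look for a <field> element with exactly this name attribute
        if ! grep -q "<field name=\"$name\"" "$schema"; then
            echo "missing field: $name" >&2
            return 1
        fi
    done
    echo "all fields present"
}
```

The function returns a non-zero status as soon as one field is missing, so it can be chained in front of the build, as in check_nutch_fields schema.xml && ant.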
Build Nutch as you did for the previous Nutch release:
[hadoop@hc1r1m2 nutch]$ pwd
/usr/local/nutch
[hadoop@hc1r1m2 nutch]$ ant
Buildfile: build.xml
....
BUILD SUCCESSFUL
Total time: 1 minute 47 seconds
Note: That was a quick build. (As you remember, the last Nutch build took more than 11 minutes.) With Nutch
built, you are ready to install Apache HBase, the Hadoop-based database, and test it.
HBase Installation
The pieces are moving into place for this second architecture example. Nutch is installed and built, and it is
configured to use Gora and HBase. The Gora component was included with the Nutch 2.x release, and Apache
ZooKeeper was already installed as part of Chapter 2's installation. Now you need to install Apache HBase.
To demonstrate its use, I show how to install HBase on a single server.
You can download HBase from the HBase website (hbase.apache.org). After clicking the Downloads option on
the left of the page, you may be directed to an alternative mirror site. That's fine; just follow the link. (I downloaded
the 0.90.4 release.) Again, it is a gzipped tar file that needs to be unpacked.
[hadoop@hc1r1m2 Downloads]$ ls -l hbase-0.90.4.tar.gz
-rw-rw-r--. 1 hadoop hadoop 37161251 Apr 8 18:36 hbase-0.90.4.tar.gz
[hadoop@hc1r1m2 Downloads]$ gunzip hbase-0.90.4.tar.gz
[hadoop@hc1r1m2 Downloads]$ tar xvf hbase-0.90.4.tar
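The two steps above can also be done in one pass, because tar can decompress gzip input itself with the z option. The sketch below wraps that in a small function; unpack_release is a hypothetical helper name used only for illustration, and the optional destination directory is an assumption rather than something the steps above require.

```shell
# Hypothetical helper: unpack a gzipped tarball in a single step.
# Usage: unpack_release hbase-0.90.4.tar.gz [destination-directory]
unpack_release() {
    archive="$1"     # the .tar.gz file to unpack
    dest="${2:-.}"   # where to unpack it (defaults to the current directory)
    mkdir -p "$dest"
    # x = extract, z = filter through gzip, f = read from the named archive
    tar -xzf "$archive" -C "$dest"
}
```

Unlike gunzip followed by tar xvf, this leaves the original .tar.gz file in place, which is convenient if you want to repeat the installation later.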
Move the unpacked release to /usr/local, and recursively change its ownership to the Linux hadoop user
with chown -R. Then create a symbolic link called “hbase” under /usr/local/ to simplify both the path and the
environment setup.
[root@hc1r1m2 Downloads]# mv hbase-0.90.4 /usr/local
[root@hc1r1m2 Downloads]# cd /usr/local
[root@hc1r1m2 local]# chown -R hadoop:hadoop hbase-0.90.4
[root@hc1r1m2 local]# ln -s hbase-0.90.4 hbase
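With the link in place, the environment setup can refer to /usr/local/hbase instead of a versioned path. The lines below are a minimal sketch of what that might look like, assuming the hadoop user runs bash and keeps such settings in a file like ~/.bashrc. HBASE_HOME is the conventional variable name, but treat the exact lines as an assumption rather than a required configuration.

```shell
# Assumed environment lines for the hadoop user (for example, in ~/.bashrc).
# They point at the version-independent symbolic link, not at hbase-0.90.4.
export HBASE_HOME=/usr/local/hbase
export PATH="$PATH:$HBASE_HOME/bin"
```

Because both lines go through the link, moving to a later HBase release only means repointing /usr/local/hbase; the environment settings stay unchanged.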
 