a popular framework for indexing and searching documents, and implements that framework
by providing a set of tools for building indexes and querying data.
store data, it is not truly compatible with Hadoop and does not use MapReduce (described
effort named Blur (described here) to build a tool on top of the Lucene framework that leverages the entire Hadoop stack.
Tutorial Links
Apart from the tutorial on the official Solr home page, there is a Solr wiki with great information.
Example Code
In this example, we're going to assume we have a set of semi-structured data consisting of
movie reviews with labels that clearly mark the title and the text of the review. These reviews
will be stored in individual JSON files in the reviews directory.
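To make the data layout concrete, here is a minimal Python sketch of what one such review file might look like. The field names title and review_text match the query used later in this example; the id field, the file naming, and the helper write_review are assumptions added for illustration, not something the original example specifies.

```python
import json
from pathlib import Path


def write_review(directory, review_id, title, review_text):
    """Write one review as its own JSON file and return the file path.

    The "id" field is an assumption (Solr schemas commonly define a unique
    key); "title" and "review_text" are the labels used in this example.
    """
    doc = {"id": review_id, "title": title, "review_text": review_text}
    path = Path(directory) / f"{review_id}.json"
    path.write_text(json.dumps(doc, indent=2), encoding="utf-8")
    return path
```

Calling write_review("reviews", "review-001", "An Example Movie", "A great film.") would create reviews/review-001.json, provided the reviews directory already exists.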
We'll start by telling Solr to index our data; there are a handful of different ways to do this,
all with unique trade-offs. In this case, we're going to use the simplest mechanism, which is
the post.sh script located in the exampledocs/ subdirectory of our Solr install:
./example/exampledocs/post.sh /reviews/*.json
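One of the other ways to index data is to send documents to Solr's HTTP update handler directly. The sketch below only builds such a request in Python and does not send it; it is a hypothetical alternative, not a description of what post.sh does. The host, port, and core name in the URL are assumptions about a local install, and build_update_request is an illustrative helper.

```python
import json
import urllib.request


def build_update_request(update_url, docs):
    """Build (but do not send) a POST request carrying a JSON list of documents."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed URL for a local Solr core named "collection1"; adjust for your install.
ASSUMED_UPDATE_URL = "http://localhost:8983/solr/collection1/update?commit=true"

request = build_update_request(
    ASSUMED_UPDATE_URL,
    [{"id": "review-001", "title": "An Example Movie", "review_text": "A great film."}],
)
# To actually index the documents, pass the request to urllib.request.urlopen(request).
```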
Once our reviews have been indexed, they are ready to search. Solr has its own graphical
user interface (GUI) that can be used for simple searches. We'll pull up that GUI and search
for movie reviews that contain the word “great”:
review_text:great&fl=title
This search tells Solr that we want to retrieve the title field (fl=title) for any review where the word “great” appears in the review_text field.
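The same search can also be expressed as a URL against Solr's select handler, where q carries the query and fl the list of fields to return. The following Python sketch only assembles that URL. The base address and core name are assumptions about a local install, wt=json is added to ask for a JSON response, and build_select_url is an illustrative helper.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields):
    """Assemble a select URL with the query (q) and returned fields (fl)."""
    params = {"q": query, "fl": ",".join(fields), "wt": "json"}
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


# Assumed base URL for a local Solr core named "collection1".
url = build_select_url(
    "http://localhost:8983/solr/collection1",
    "review_text:great",
    ["title"],
)
print(url)
```

Note that urlencode percent-encodes the colon in review_text:great, so the printed URL will not contain the query exactly as typed into the GUI.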