a popular framework for indexing and searching documents, and implements that framework
by providing a set of tools for building indexes and querying data.
While Solr is able to use the Hadoop Distributed File System (HDFS; described here) to
store data, it is not truly compatible with Hadoop and does not use MapReduce (described
here) or YARN (described here) to build indexes or respond to queries. There is a similar
effort named Blur (described here) to build a tool on top of the Lucene framework that
leverages the entire Hadoop stack.
Tutorial Links
Apart from the tutorial on the official Solr home page, there is a Solr wiki with great
information.
Example Code
In this example, we're going to assume we have a set of semi-structured data consisting of
movie reviews with labels that clearly mark the title and the text of the review. These reviews
will be stored in individual JSON files in the reviews directory.
We'll start by telling Solr to index our data; there are a handful of different ways to do this,
all with unique trade-offs. In this case, we're going to use the simplest mechanism, which is
the post.sh script located in the example/exampledocs/ subdirectory of our Solr install:
./example/exampledocs/post.sh /reviews/*.json
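One of the other ways to index the same files is to send them to Solr over HTTP from a program. The following is a minimal Python sketch of that idea, not the mechanism post.sh itself uses. It assumes each review file holds a single JSON object with title and review_text fields, and that Solr accepts a JSON array of documents at an update URL; the helper names load_reviews and build_update_request are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request
from pathlib import Path


def load_reviews(directory):
    """Read every *.json file in `directory` and return the parsed documents."""
    docs = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            docs.append(json.load(handle))
    return docs


def build_update_request(docs, update_url):
    """Build (but do not send) a POST request carrying the documents as a JSON array.

    `update_url` is an assumption: it depends on the Solr version and core name.
    """
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually submit the documents, the request object could be passed to urllib.request.urlopen against a running Solr instance, for example with an update URL such as http://localhost:8983/solr/reviews/update?commit=true, where the host, port, and the core name reviews are all assumed here.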
Once our reviews have been indexed, they are ready to search. Solr has its own graphical
user interface (GUI) that can be used for simple searches. We'll pull up that GUI and search
for movie reviews that contain the word “great”:
review_text:great&fl=title
This search tells Solr that we want to retrieve the title field ( fl=title ) for any review
where the word “great” appears in the review_text field.
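The same search can also be expressed as a URL against Solr's HTTP query interface. The sketch below only builds that URL so the pieces of the query are visible; it assumes a local Solr instance on port 8983, a core named reviews, and the standard select handler, none of which come from the example above. The function name build_select_url is hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields):
    """Return a Solr select URL for `query`, asking only for the listed fields."""
    params = {
        "q": query,              # the search expression, e.g. field:term
        "fl": ",".join(fields),  # comma-separated list of fields to return
        "wt": "json",            # ask for a JSON response
    }
    return f"{base_url}/select?{urlencode(params)}"


# Assumed host, port, and core name; adjust for a real installation.
url = build_select_url(
    "http://localhost:8983/solr/reviews",
    "review_text:great",
    ["title"],
)
print(url)
# prints http://localhost:8983/solr/reviews/select?q=review_text%3Agreat&fl=title&wt=json
```

Note that the colon in review_text:great is percent-encoded as %3A when placed in a URL; Solr decodes it back before parsing the query.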