NoSQL systems combine document store concepts with full-text indexing solutions,
which produces search results of higher quality. Understanding why NoSQL search
results are superior will help you evaluate the merits of these systems.
In this chapter, we'll show you how NoSQL databases can be used to build high-
quality and cost-effective search solutions, and help you understand how findability
impacts NoSQL system selection. We'll start this chapter with definitions of search
terms, and then introduce some more complex concepts used in search technologies.
Later, we'll look at three case studies that show how reverse indexes are created and
how search is applied in technical documentation and reporting.
7.1 What is NoSQL search?
For our purposes, we'll define search as finding an item of interest in your NoSQL
database when you have only partial information about that item. For example, you may
know some of the keywords in a document, but not know the document title, author,
or date of creation.
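
As a minimal sketch of this definition, assume a toy collection in which you remember a couple of keywords from a document's text but not its title, author, or creation date. The sample documents, their field names, and the `find_by_keywords` helper are hypothetical and exist only to illustrate the idea; a real NoSQL system would answer this kind of query from its own indexes rather than by scanning every document.

```python
import re

# Hypothetical sample collection; all fields and values are invented.
DOCUMENTS = [
    {"id": 1, "title": "Installing the server", "author": "Kim",
     "created": "2012-03-01",
     "body": "Download the installer and configure the network port."},
    {"id": 2, "title": "Backup guide", "author": "Lee",
     "created": "2012-05-17",
     "body": "Schedule a nightly backup and verify the restore procedure."},
    {"id": 3, "title": "Network tuning", "author": "Kim",
     "created": "2013-01-09",
     "body": "Tune the network buffers before you schedule load tests."},
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_by_keywords(documents, keywords):
    """Return ids of documents whose body contains every given keyword."""
    wanted = {keyword.lower() for keyword in keywords}
    return [doc["id"] for doc in documents if wanted <= tokenize(doc["body"])]


# We know two words from the text, but not the title, author, or date.
matches = find_by_keywords(DOCUMENTS, ["network", "schedule"])
print(matches)
```

Only the body text is consulted here, so the lookup succeeds even though none of the other fields are known.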
Search technologies apply to highly structured records similar to those in an
RDBMS as well as “unstructured” plain-text documents that contain words, sentences,
and paragraphs. A large number of documents also fall somewhere in the middle;
these are called semi-structured data.
Search is one of the most important tools to help increase the productivity of
knowledge workers. Studies show that finding the right document quickly can save
hours of time each day. Companies such as Google and Yahoo!, pioneers in the use of
NoSQL systems, were driven by the problems involved in document search and
retrieval. Before we begin looking at how NoSQL systems can be used to create search
solutions, let's define some terms used when building search applications.
7.2 Types of search
As you build applications, you'll reach the point where providing search becomes
important to your users. So let's look at the types of search that you
could provide: Boolean search used in RDBMSs, full-text keyword search used in
frameworks such as Apache Lucene, and structured search popular in NoSQL systems
that use XML or JSON type documents.
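
The three types can be contrasted with a small sketch. It assumes a hypothetical collection of JSON-like documents. The function names (`boolean_search`, `keyword_search`, `structured_search`) are invented for illustration and are not the API of any RDBMS, Lucene, or NoSQL product. Result ranking, which full-text engines normally provide, is left out, and structured search is shown in one simple form: a keyword match restricted to a single field, combined with an exact field test.

```python
import re
from collections import defaultdict

# Hypothetical documents with both structured fields and free text.
DOCS = [
    {"id": 1, "status": "published", "year": 2012,
     "title": "Search basics", "body": "Keyword search uses an index of words."},
    {"id": 2, "status": "draft", "year": 2012,
     "title": "Index design", "body": "A reverse index maps words to documents."},
    {"id": 3, "status": "published", "year": 2013,
     "title": "Reporting", "body": "Reports can search structured records."},
]


def words(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_search(docs, status, year):
    """Boolean search: exact field tests joined with AND."""
    return [d["id"] for d in docs if d["status"] == status and d["year"] == year]


def build_index(docs, field):
    """Map each word in the given field to the ids of documents containing it."""
    index = defaultdict(set)
    for d in docs:
        for w in words(d[field]):
            index[w].add(d["id"])
    return index


def keyword_search(index, keyword):
    """Full-text keyword search: look the word up in the index."""
    return sorted(index.get(keyword.lower(), set()))


def structured_search(docs, keyword, field, status):
    """Structured search: keyword match in one field plus an exact field test."""
    hits = build_index(docs, field).get(keyword.lower(), set())
    return sorted(d["id"] for d in docs if d["id"] in hits and d["status"] == status)


body_index = build_index(DOCS, "body")
print(boolean_search(DOCS, "published", 2012))
print(keyword_search(body_index, "index"))
print(structured_search(DOCS, "index", "title", "draft"))
```

The last call differs from the plain keyword search because it looks for the word only in the `title` field, so document 1, which mentions "index" only in its body, is not returned.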
7.2.1 Comparing Boolean, full-text keyword, and structured search models
If you've used RDBMSs, you might be familiar with creating search programs that look
for specific records in a database. You might also have used tools such as Apache
Lucene and Apache Solr to find specific documents using full-text keyword search. In
this section, we'll introduce a new type of search: structured search. Structured search
combines features from both Boolean and full-text keyword search. To get us started,
table 7.1 compares the three main search types.