Figure 7.1 Vector search is a way to find documents that are closest to a keyword. By
counting the number of keywords per page, you can rank all documents by a keyword
space dimension. (Labels in the figure: Keyword 1, Keyword 2, your search keyword,
search region threshold, other documents, search score is distance measurement.)
As you might guess, calculating search vectors is complex. Luckily, vector distance
calculations are included in most full-text search systems. Once your full-text
indexes have been created, the job of building a search engine can be as
easy as combining your query with a search system query function.
Vector search is one of the key technologies that allow users to perform fuzzy
searches. It helps you find inexact matches to documents that are “in the
neighborhood” of your query keywords. Vector search tools also allow you to
treat entire documents as a keyword collection for additional searches. This
feature allows search systems to add functions such as “find similar documents” to
an individual document.
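The idea of ranking documents by their distance from a query in keyword space can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search system implements it: the function names (`keyword_vector`, `cosine_distance`, `vector_search`), the choice of cosine distance as the score, and the default threshold of 0.5 are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def keyword_vector(text, keywords):
    """Count how often each keyword occurs in text (one dimension per keyword)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return [counts[k] for k in keywords]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: a vector with no keywords counts as far away
    return 1.0 - dot / (norm_a * norm_b)


def vector_search(query, docs, keywords, threshold=0.5):
    """Return (distance, name) pairs inside the threshold, closest first."""
    q = keyword_vector(query, keywords)
    scored = [
        (cosine_distance(q, keyword_vector(text, keywords)), name)
        for name, text in docs.items()
    ]
    return sorted(pair for pair in scored if pair[0] <= threshold)


docs = {
    "a": "NoSQL search: search index",
    "b": "cooking recipes pasta",
    "c": "NoSQL index index",
}
keywords = ["nosql", "search", "index"]
for distance, name in vector_search("nosql search", docs, keywords):
    print(name, round(distance, 3))
```

Passing the full text of an existing document as `query` gives a rough version of "find similar documents", since the document is simply treated as a keyword collection.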
N-gram search —N-gram search is the process of breaking long strings into short,
fixed-length strings (typically three characters long) and indexing these strings
for exact match searches that may include whitespace characters. N-gram
indexes can take up a large amount of disk space, but are the only way to
quickly search some types of text such as software source code (where all characters,
including spaces, may be important). N-gram indexes are also used for
finding patterns in long strings of text such as DNA sequences.
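As a rough sketch of the N-gram approach described above, the following Python builds a trigram index and uses it to answer exact substring queries, whitespace included. The helper names (`ngrams`, `build_index`, `search`) and the in-memory dictionary are assumptions for illustration; a real system would store the index on disk and handle much larger inputs.

```python
from collections import defaultdict


def ngrams(text, n=3):
    """Return every overlapping n-character substring, whitespace included."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_index(docs, n=3):
    """Map each n-gram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in ngrams(text, n):
            index[gram].add(doc_id)
    return index


def search(index, docs, query, n=3):
    """Return ids of documents containing query as an exact substring."""
    grams = ngrams(query, n)
    if not grams:
        # Query is shorter than n, so the index cannot help: scan every document.
        return sorted(d for d, text in docs.items() if query in text)
    # Only documents containing every n-gram of the query can match.
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Verify candidates, because the n-grams might occur non-contiguously.
    return sorted(d for d in candidates if query in docs[d])


docs = {
    "a.py": "x = 1\nif x == 1:\n    print(x)",
    "b.py": "x=1\nprint(x)",
}
index = build_index(docs)
print(search(index, docs, "x = 1"))
print(search(index, docs, "print(x)"))
```

The index stores one entry per distinct three-character window, which is why N-gram indexes tend to be large relative to the text they cover.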
Although there are clearly many types of searches, there are also many tools that make
these searches fast. As we move to our next section, you'll see how NoSQL systems are
able to find and retrieve your requested information rapidly.
7.3 Strategies and methods that make NoSQL search effective
So how are NoSQL systems able to take your requested search information and return
the results so fast? Let's take a look at the strategies and methods that make NoSQL
search systems so effective:
Range index —A range index is a way of indexing all database element values in
increasing order. Range indexes are ideal for alphabetical keywords, dates,
timestamps, or amounts where you might want to find all items equal to a specific
value or between two values. Range indexes can be created on any data