Advanced Queries - The Definitive Guide to MongoDB

Chapter 8

Advanced Queries

The chapters so far have covered most of the basic query mechanisms for finding one or more documents that match
given criteria and bringing them back to your application for processing. Sometimes, however, these normal query
mechanisms fall short, and you want to perform complex operations over most or all of the documents in your
collection. When operations of this kind are required, many developers either iterate through every document in the
collection or write a series of queries to be executed in sequence to perform the necessary calculations. Although
this is a valid way of doing things, it can be burdensome to write and maintain, as well as inefficient. For these
reasons, MongoDB provides some advanced query mechanisms that you can use to get the most from your data. The
advanced MongoDB features we'll examine in this chapter are full-text search, the aggregation framework, and the
MapReduce framework.

Full-text search is one of the most-requested features to be added to MongoDB. It represents the ability to create
specialized text indexes in MongoDB and then perform text searches on those indexes to locate documents that contain
matching text elements. The MongoDB full-text search feature goes beyond simple string matching: it stems words
according to the language you have selected for your documents, which makes it a powerful tool for performing
language queries on your documents. This recently introduced feature is marked as “experimental” in the 2.4 releases
of MongoDB because the development team is still working to improve it, which means you must manually activate it
for use in your MongoDB environment.
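
As a rough sketch of what manual activation looks like in the 2.4 releases, text search is controlled by the
textSearchEnabled server parameter. The variable name enableTextSearch is only an illustrative choice, and the shell
and command-line invocations appear in comments so that the snippet runs as plain JavaScript without a server.

```javascript
// Command document that turns on the experimental text search feature (MongoDB 2.4).
const enableTextSearch = { setParameter: 1, textSearchEnabled: true };

// In the mongo shell, connected to a 2.4 server, the document would be sent with:
//   db.adminCommand(enableTextSearch)
//
// The same parameter can instead be supplied when the server starts:
//   mongod --setParameter textSearchEnabled=true

console.log(JSON.stringify(enableTextSearch));
```

In a sharded or replicated deployment, the parameter presumably has to be set on every node that will serve text
searches, so it is worth checking each one.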

The second feature this chapter will cover is the MongoDB aggregation framework. Introduced in Chapters 4 and 6,
this feature provides a whole host of query functions that let you iterate over selected documents, or all of them,
gathering or manipulating information. These query functions are arranged into a pipeline of operations that are
performed one after another on your collection, with each operation working on the output of the one before it.
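
The pipeline idea can be illustrated with a small in-memory sketch. This is not how MongoDB implements aggregation;
it only shows stages being applied one after another. The sample documents and the function names matchLongPosts,
groupByAuthor, and runPipeline are hypothetical, chosen to loosely resemble a $match stage followed by a $group stage.

```javascript
// Hypothetical sample data standing in for a collection.
const documents = [
  { author: "alice", words: 120 },
  { author: "bob", words: 80 },
  { author: "alice", words: 200 },
  { author: "bob", words: 150 },
];

// Each stage takes an array of documents and returns a new array.
// Stage 1: keep only documents with at least 100 words (similar in spirit to $match).
const matchLongPosts = (docs) => docs.filter((d) => d.words >= 100);

// Stage 2: total the word counts per author (similar in spirit to $group with $sum).
const groupByAuthor = (docs) => {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d.author, (totals.get(d.author) || 0) + d.words);
  }
  return [...totals].map(([author, totalWords]) => ({ _id: author, totalWords }));
};

// The pipeline is an ordered list of stages; the output of one feeds the next.
const pipeline = [matchLongPosts, groupByAuthor];
const runPipeline = (docs, stages) => stages.reduce((acc, stage) => stage(acc), docs);

const result = runPipeline(documents, pipeline);
console.log(result);
// [ { _id: 'alice', totalWords: 320 }, { _id: 'bob', totalWords: 150 } ]
```

Because the filter runs first, the grouping stage never sees the 80-word document, which is why the order of stages
matters.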

The third and final feature we will cover is called MapReduce, which will sound familiar to those of you who have
worked with Hadoop. MapReduce is a powerful mechanism that makes use of MongoDB's built-in JavaScript engine to
execute custom code against your documents. It uses two JavaScript functions: one to map your data, and another to
reduce the mapped data, transforming it and pulling information out of it.
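
To make the two-function idea concrete, here is a minimal in-memory sketch in plain JavaScript. It is an
illustration under stated assumptions rather than MongoDB's own machinery: the sample documents and the mapReduce
driver are hypothetical, and the map function receives the document and an emit callback as arguments, whereas in
MongoDB the map function reads the document from this and calls a global emit().

```javascript
// Hypothetical sample documents.
const docs = [
  { color: "red", num: 3 },
  { color: "blue", num: 5 },
  { color: "red", num: 4 },
];

// Map: called once per document; emits zero or more (key, value) pairs.
function map(doc, emit) {
  emit(doc.color, doc.num);
}

// Reduce: combines all values emitted for one key into a single value.
function reduce(key, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Minimal driver: collect emitted pairs by key, then reduce each group.
function mapReduce(documents, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of documents) mapFn(doc, emit);

  const out = {};
  for (const [key, values] of groups) out[key] = reduceFn(key, values);
  return out;
}

const totals = mapReduce(docs, map, reduce);
console.log(totals); // { red: 7, blue: 5 }
```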

Probably the most important thing to remember throughout this chapter is that these are truly advanced features,
and misusing them can cause serious performance problems for your MongoDB nodes. Whenever possible, you should try
out any of these features in a testing environment before deploying them to important systems.

Text Search

To use MongoDB's text search, you first create a full-text index, specifying the fields that you wish to be indexed
to facilitate text searching. Building this text index involves going over every document in your collection and
tokenizing and stemming each string of text in those fields. Tokenizing breaks the text down into tokens, which
conceptually are close to words. MongoDB then stems each token to find the root concept for the token. For example,
suppose that breaking down a string yields the token fishing. This token is stemmed back to the root word fish, so
MongoDB creates an index entry of fish for that document. The same process of tokenizing and stemming is applied to
the search parameters a user enters to perform a given text search. The stemmed parameters are then compared against
each document, and a relevance score is calculated. The documents are then returned to the user ordered by their
score.
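
The tokenize, stem, and score steps described above can be sketched in a few lines of plain JavaScript. This is a
toy model only: the suffix-stripping stem function is a crude stand-in for a real language-aware stemmer, the score
is simply a count of matching stems rather than MongoDB's actual relevance calculation, and every function name and
sample document here is hypothetical.

```javascript
// Break text into lowercase tokens, which are roughly words.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Toy stemmer (an assumption for illustration): strip a few common English suffixes.
function stem(token) {
  for (const suffix of ["ing", "es", "s"]) {
    if (token.length > suffix.length + 2 && token.endsWith(suffix)) {
      return token.slice(0, -suffix.length);
    }
  }
  return token;
}

// Build an index of stem -> (document id -> number of occurrences) over the chosen fields.
function buildIndex(docs, fields) {
  const index = new Map();
  for (const doc of docs) {
    for (const field of fields) {
      for (const token of tokenize(String(doc[field] || ""))) {
        const root = stem(token);
        if (!index.has(root)) index.set(root, new Map());
        const postings = index.get(root);
        postings.set(doc._id, (postings.get(doc._id) || 0) + 1);
      }
    }
  }
  return index;
}

// Stem the search terms the same way, score each matching document, and sort by score.
function search(index, query) {
  const scores = new Map();
  for (const token of tokenize(query)) {
    const postings = index.get(stem(token));
    if (!postings) continue;
    for (const [id, count] of postings) {
      scores.set(id, (scores.get(id) || 0) + count);
    }
  }
  return [...scores]
    .map(([_id, score]) => ({ _id, score }))
    .sort((a, b) => b.score - a.score);
}

const sampleDocs = [
  { _id: 1, body: "Fishing on the river" },
  { _id: 2, body: "He fishes and goes fishing every weekend" },
  { _id: 3, body: "Hiking in the hills" },
];

const index = buildIndex(sampleDocs, ["body"]);
const hits = search(index, "fishing");
console.log(hits); // [ { _id: 2, score: 2 }, { _id: 1, score: 1 } ]
```

Note that the query fishing matches the document containing fishes, because both are reduced to the stem fish before
they are compared.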
