SKIP AND LIMIT
There's nothing mysterious about the semantics of skip and limit. These query
options should always work as you expect.
But beware of passing large values (say, values greater than 10,000) for
skip, because serving such queries requires scanning over a number of documents
equal to the skip value. For example, imagine that you're paginating a million
documents sorted by date, descending, with 10 results per page. The query to
display page 50,001 will then include a skip value of 500,000, which forces the
server to walk past half a million documents just to return 10. A better strategy
is to omit the skip altogether and instead add a range condition to the query
that indicates where the next result set begins. Thus, this query
db.docs.find({}).skip(500000).limit(10).sort({date: -1})
becomes this:
db.docs.find({date: {$lt: previous_page_date}}).limit(10).sort({date: -1})
This second query will scan far fewer items than the first. The only potential problem
is that if date isn't unique for each document, documents that share a date across a
page boundary may be skipped or displayed more than once, depending on whether the
range comparison is strict. There are many strategies for dealing with this, but the
solutions are left as exercises for the reader.
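One common strategy, sketched here as an illustration rather than as the author's solution, is to break ties on a second, unique field. The plain JavaScript below pages through an in-memory array standing in for a collection; the sample documents, the numeric date values, and the helper names compareDesc and nextPage are all assumptions made for the example. In the shell, the same idea would presumably be a range condition on date combined with _id, with the sort extended to {date: -1, _id: -1}.

```javascript
// In-memory stand-in for a collection. Dates are plain numbers for brevity,
// and several documents deliberately share the same date.
const docs = [
  { _id: 1, date: 20240105 },
  { _id: 2, date: 20240105 },
  { _id: 3, date: 20240104 },
  { _id: 4, date: 20240104 },
  { _id: 5, date: 20240103 },
];

// Order by date descending, then by _id descending to break ties.
function compareDesc(a, b) {
  if (a.date !== b.date) return b.date - a.date;
  return b._id - a._id;
}

// Return the page that follows `last`, the final document of the previous
// page. Pass null to get the first page. (Hypothetical helper.)
function nextPage(collection, last, pageSize) {
  const sorted = [...collection].sort(compareDesc);
  const remaining = last === null
    ? sorted
    : sorted.filter(d =>
        d.date < last.date ||
        (d.date === last.date && d._id < last._id));
  return remaining.slice(0, pageSize);
}

// With a page size of 3, the two documents dated 20240104 straddle the
// boundary between the first and second pages.
const page1 = nextPage(docs, null, 3);
const page2 = nextPage(docs, page1[page1.length - 1], 3);

console.log(page1.map(d => d._id));
console.log(page2.map(d => d._id));
```

Because the comparison uses both fields, the document that shares a date with the last item of the first page still appears on the second page, and nothing is shown twice.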
5.3 Aggregating orders
You've already seen a basic example of MongoDB's aggregation in the count
command, which you used for pagination. Most databases provide count plus a lot of
other built-in aggregation functions for calculating sums, averages, variances, and the
like. These features are on the MongoDB roadmap, but until they're implemented,
you can use group and map-reduce to script any aggregate function, from simple sums
to standard deviations.
5.3.1 Grouping reviews by user
It's common to want to know which users provide the most valuable reviews. Since the
application allows users to vote on reviews, it's technically possible to calculate the
total number of votes for all of a user's reviews along with the average number of votes
a user receives per review. Though you could get these stats by querying all reviews
and doing some basic client-side processing, you can also use MongoDB's group
command to get the result from the server.
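For comparison, the client-side approach mentioned above might look like the following sketch. It is plain JavaScript over an in-memory array; the sample documents, the votes field holding a number, and the statsByUser helper are assumptions made for illustration, not part of the application's actual schema.

```javascript
// Hypothetical review documents, as they might arrive from a query
// that fetches every review.
const reviews = [
  { user_id: "u1", votes: 3 },
  { user_id: "u1", votes: 5 },
  { user_id: "u2", votes: 4 },
];

// Tally review count and total votes per user, then derive the average.
function statsByUser(reviewDocs) {
  const stats = {};
  for (const doc of reviewDocs) {
    if (!stats[doc.user_id]) {
      stats[doc.user_id] = { reviews: 0, votes: 0 };
    }
    stats[doc.user_id].reviews += 1;
    stats[doc.user_id].votes += doc.votes;
  }
  for (const id of Object.keys(stats)) {
    stats[id].average = stats[id].votes / stats[id].reviews;
  }
  return stats;
}

const stats = statsByUser(reviews);
console.log(stats);
```

The cost of this approach is that every review has to cross the network before any arithmetic happens, which is what running the aggregation on the server avoids.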
group takes a minimum of three arguments. The first, key, defines how your data
will be grouped. In this case, you want your results to be grouped by user, so your
grouping key is user_id. The second argument, known as the reduce function, is a
JavaScript function that aggregates over a result set. The final argument to group is an
initial document for the reduce function.
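To make the roles of the three arguments concrete, here is a rough emulation in plain JavaScript. It is not MongoDB's implementation: the emulateGroup function is hypothetical, the key is simplified to a single field name, and the counting reduce function is chosen only to keep the example small.

```javascript
// Emulates the shape of a grouping step: each distinct key value gets its
// own copy of the initial document, and reduce is called once per input
// document with that group's aggregator.
function emulateGroup(docs, key, reduce, initial) {
  const groups = new Map();
  for (const doc of docs) {
    const value = doc[key];
    if (!groups.has(value)) {
      // Shallow copy, so groups never share one aggregator object.
      groups.set(value, { [key]: value, ...initial });
    }
    reduce(doc, groups.get(value));
  }
  return [...groups.values()];
}

// Hypothetical sample data.
const sample = [
  { user_id: "a", votes: 2 },
  { user_id: "b", votes: 1 },
  { user_id: "a", votes: 4 },
];

// Count documents per user: the reduce function mutates the aggregator.
const counts = emulateGroup(
  sample,
  "user_id",
  function (doc, aggregator) { aggregator.count += 1; },
  { count: 0 }
);

console.log(counts);
```

The point to notice is that the reduce function returns nothing. It updates the aggregator in place, and the initial document only supplies the starting values for each group.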
This sounds more complicated than it is. To see why, let's look more closely at the
initial document you'll use and at its corresponding reduce function:
initial = {review: 0, votes: 0};
reduce = function(doc, aggregator) {