If you're going to issue this query in a production situation, you'll ideally have an
index on helpful_votes. If you want the review with the greatest number of helpful
votes for a particular product, you'll want a compound index on product_id and
helpful_votes. If the reason for this isn't clear, refer to chapter 7.
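As a rough sketch of the compound-index advice above, the following wraps the index definition and the query in two small functions that take a shell-style db handle. The collection name reviews and the function names createReviewIndex and mostHelpfulReview are assumptions made for illustration. createIndex is the current shell helper; shells from the era of this text spelled it ensureIndex.

```javascript
// Sketch only: `db` is a mongo-shell-style database handle, and `reviews`
// is an assumed collection name.

// Run once. Index entries are ordered by product_id first, then by
// helpful_votes (descending) within each product.
function createReviewIndex(db) {
  db.reviews.createIndex({product_id: 1, helpful_votes: -1});
}

// Returns a cursor over at most one document: the review with the most
// helpful votes for the given product.
function mostHelpfulReview(db, productId) {
  return db.reviews
    .find({product_id: productId})
    .sort({helpful_votes: -1})
    .limit(1);
}
```

Because the query filters on product_id and sorts on helpful_votes, both fields appear in the index, in that order.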
5.4.2 Distinct
MongoDB's distinct command is the simplest tool for getting a list of distinct values
for a particular key. The command works for both single keys and array keys.
distinct covers an entire collection by default but can be constrained with a query
selector.
You can use distinct to get a list of all the unique tags on the products collection
as follows:
db.products.distinct("tags")
It's that simple. If you want to operate on a subset of the products collection, you can
pass a query selector as the second argument. Here, the query limits the distinct tag
values to products in the Gardening Tools category:
db.products.distinct("tags",
{category_id: ObjectId("6a5b1476238d3b4dd5000048")})
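To make the single-key versus array-key behavior concrete, here is a small in-memory model written in plain JavaScript. It is not how MongoDB implements distinct; the function name distinctValues, the sample documents, and the simplified selector (exact equality on top-level fields only) are all assumptions for illustration. The sketch returns values in first-seen order and doesn't try to model the ordering of the real command.

```javascript
// Illustrative in-memory model of distinct; not MongoDB's implementation.
function distinctValues(docs, key, selector = {}) {
  // Simplified selector: exact equality on top-level fields only.
  const matches = (doc) =>
    Object.keys(selector).every((field) => doc[field] === selector[field]);

  const seen = new Set(); // adequate for primitive values such as strings
  for (const doc of docs.filter(matches)) {
    const value = doc[key];
    if (value === undefined) continue; // documents lacking the key add nothing
    // An array key contributes each element; a single key contributes itself.
    for (const item of Array.isArray(value) ? value : [value]) {
      seen.add(item);
    }
  }
  return [...seen];
}

// Hypothetical sample data.
const products = [
  {name: "Wheelbarrow", category: "gardening", tags: ["tools", "garden"]},
  {name: "Trowel", category: "gardening", tags: ["tools", "soil"]},
  {name: "Lamp", category: "home", tags: ["lighting"]},
];

console.log(distinctValues(products, "tags"));
// → [ 'tools', 'garden', 'soil', 'lighting' ]
console.log(distinctValues(products, "tags", {category: "gardening"}));
// → [ 'tools', 'garden', 'soil' ]
console.log(distinctValues(products, "category"));
// → [ 'gardening', 'home' ]
```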
AGGREGATION COMMAND LIMITATIONS
For all their practicality, distinct and group suffer from a significant limitation:
they can't return a result set greater than 16 MB. The 16 MB limit isn't a threshold
imposed on these commands per se but rather on all initial query result sets.
distinct and group are implemented as commands, which means that they're
implemented as queries on the special $cmd collection, and their being queries is
what subjects them to this limitation. If distinct or group can't handle your
aggregation result size, then you'll want to use map-reduce instead, where the
results can be stored in a collection rather than being returned inline.
5.4.3 Group
group, like distinct, is a database command, and thus its result set is subject to the
same 16 MB response limitation. Furthermore, to reduce memory consumption,
group won't process more than 10,000 unique keys. If your aggregate operation fits
within these bounds, then group can be a good choice because it's frequently faster
than map-reduce.
You've already seen a semi-involved example of grouping reviews by user. Let's
quickly review the options to be passed to group:
key: A document describing the fields to group by. For instance, to group by
category_id, you'd use {category_id: true} as your key. The key can also be
compound. For instance, if you wanted to group a series of posts by user_id
and rating, your key would look like this: {user_id: true, rating: true}.
The key option is required unless you're using keyf.
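The way a key document partitions a collection can be sketched with a small in-memory model in plain JavaScript. This is not MongoDB's group command; the helper names groupKeyFor and countByKey and the sample posts are hypothetical, and the only aggregation shown is a per-group count.

```javascript
// Illustrative model of how a key document such as {user_id: true, rating: true}
// splits documents into groups. Not MongoDB's implementation.

// Pick out the fields named in the key spec from one document.
function groupKeyFor(doc, keySpec) {
  const key = {};
  for (const field of Object.keys(keySpec)) {
    if (keySpec[field]) key[field] = doc[field];
  }
  return key;
}

// Count how many documents fall into each distinct key.
function countByKey(docs, keySpec) {
  const groups = new Map();
  for (const doc of docs) {
    const key = groupKeyFor(doc, keySpec);
    const id = JSON.stringify(key); // serialized key identifies the group
    if (!groups.has(id)) groups.set(id, {...key, count: 0});
    groups.get(id).count += 1;
  }
  return [...groups.values()];
}

// Hypothetical sample data.
const posts = [
  {user_id: "u1", rating: 5},
  {user_id: "u1", rating: 5},
  {user_id: "u1", rating: 3},
  {user_id: "u2", rating: 5},
];

console.log(countByKey(posts, {user_id: true}));
// → two groups: u1 (count 3) and u2 (count 1)
console.log(countByKey(posts, {user_id: true, rating: true}));
// → three groups: (u1, 5) count 2, (u1, 3) count 1, (u2, 5) count 1
```

A single-field key yields one group per user, while the compound key yields one group per distinct user_id and rating pair.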