Database Reference
In-Depth Information
The
reduce
function will be passed a key and an array of one or more values. Your job
in writing a reduce function is to make sure that those values are aggregated together
in the desired way and then returned as a single value. Because of
map-reduce
's itera-
tive nature,
reduce
may be invoked more than once, and your code must take this into
account. All this means in practice is that the value returned by the
reduce
function
must be identical in form to the value emitted by the map function. Look closely and
you'll see that this is the case.
The shell's
map-reduce
method requires a map and a
reduce
function as argu-
ments. But this example adds two more. The first is a query filter, which limits the doc-
uments involved in the aggregation to orders made since the beginning of 2010. The
second argument is the name of the output collection:
filter = {purchase_date: {$gte: new Date(2010, 0, 1)}}
db.orders.mapReduce(map, reduce, {query: filter, out: 'totals'})
The results are stored in a collection called
totals
, and you can query this collec-
tion like you do any other. The following listing displays the results of querying one
of these collections. The
_id
field holds your grouping key and the year and month,
and the
value
field references the reduced totals.
Listing 5.3
Querying the
map-reduce
output collection
> db.totals.find()
{ _id: "1-2011", value: { total: 32002300, items: 59 }}
{ _id: "2-2011", value: { total: 45439500, items: 71 }}
{ _id: "3-2011", value: { total: 54322300, items: 98 }}
{ _id: "4-2011", value: { total: 75534200, items: 115 }}
{ _id: "5-2011", value: { total: 81232100, items: 121 }}
The examples here should give you some sense of MongoDB's aggregation capabili-
ties in practice. In the next section, we'll cover most of the hairy details.
5.4
Aggregation in detail
Here I'll provide some extra details on MongoDB's aggregation functions.
5.4.1
Maxima and minima
You'll commonly need to find min and max values for a given value in a collection.
Databases using
SQL
provide special
min()
and
max()
functions, but MongoDB
doesn't. Instead, you must specify these queries literally. To find a maximum value,
you can sort descending by the desired value and limit the result set to just one docu-
ment. You can get a corresponding minimum by reversing the sort. For example, if
you wanted to find the review with the greatest number of helpful votes, your query
would need to sort by that value and limit by one:
db.reviews.find({}).sort({helpful_votes: -1}).limit(1)
The
helpful_votes
field in the returned document will contain the maximum value
for that field. To get the minimum value, just reverse the sort order:
db.reviews.find({}).sort({helpful_votes: 1}).limit(1)