sort: A sort to be applied to the query. This is most useful when used in
conjunction with the limit option. That way, you could run map-reduce on the
1,000 most recently created documents.
limit: An integer specifying a limit to be applied to the query and sort.
out: This parameter determines how the output is returned. To return all
output as the result of the command itself, pass {inline: 1} as the value. Note
that this works only when the result set fits within the 16 MB return limit.
The other option is to place the results into an output collection. To do this,
the value of out must be a string identifying the name of the collection where
the results are to be stored.
One problem with writing to an output collection is that you may overwrite
existing data if you've recently run a similar map-reduce. Therefore, two other
collection output options exist: one for merging the results with the old data
and another for reducing against the old data. In the merge case, notated as
{merge: "collectionName"}, the new results will overwrite any existing items
having the same key. In the reduce case, {reduce: "collectionName"}, existing
keys' values will be reduced against new values using the reduce function. The
reduce output method is especially helpful for performing iterative
map-reduce, where you want to integrate new data into an existing aggregation.
When you run the new map-reduce against the source collection, you simply add a
query selector so that the aggregation runs over only the new data.
finalize: A JavaScript function to be applied to each resulting document
after the reduce phase is complete.
scope: A document that specifies values for variables to be globally accessible
by the map, reduce, and finalize functions.
verbose: A Boolean that, when true, includes statistics on the execution time
of the map-reduce job in the command's return document.
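To make the output modes concrete, here is a minimal sketch in plain JavaScript. It is an in-memory stand-in, not MongoDB's implementation: the applyOut helper, the sample documents, the collection names (orders, order_totals), the created_at field, and the scope value are all hypothetical and exist only for illustration. The options document at the end shows how the parameters described above might be combined in a shell call.

```javascript
// Hypothetical contents of an output collection left by an earlier run.
const existing = [
  { _id: "alice", value: 10 },
  { _id: "bob", value: 5 },
];

// Hypothetical results of a new run over newer documents only.
const fresh = [
  { _id: "bob", value: 7 },
  { _id: "carol", value: 3 },
];

// Same shape as a map-reduce reduce function: (key, values) -> one value.
function reduce(key, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// In-memory illustration of the three collection output modes.
// "replace" corresponds to passing a plain collection name as out.
function applyOut(mode, oldDocs, newDocs, reduceFn) {
  if (mode === "replace") {
    return newDocs.map((d) => ({ ...d })); // old contents are discarded
  }
  if (mode !== "merge" && mode !== "reduce") {
    throw new Error("unknown output mode: " + mode);
  }
  const byKey = new Map(oldDocs.map((d) => [d._id, d.value]));
  for (const { _id, value } of newDocs) {
    if (mode === "reduce" && byKey.has(_id)) {
      // reduce: fold the new value into the stored one
      byKey.set(_id, reduceFn(_id, [byKey.get(_id), value]));
    } else {
      // merge, or a key not seen before: the new value is stored as-is
      byKey.set(_id, value);
    }
  }
  return [...byKey].map(([_id, value]) => ({ _id, value }));
}

// An options document as it might be passed in the shell, for example:
//   db.orders.mapReduce(map, reduce, options)
const options = {
  sort: { created_at: -1 },        // newest first (hypothetical field)
  limit: 1000,                     // only the 1,000 most recent documents
  out: { reduce: "order_totals" }, // fold results into an existing collection
  finalize: function (key, value) {
    return Math.round(value * 100) / 100; // tidy each reduced value
  },
  scope: { currency: "USD" },      // visible to map, reduce, and finalize
  verbose: true,                   // include timing statistics
};

console.log(applyOut("merge", existing, fresh, reduce));
console.log(applyOut("reduce", existing, fresh, reduce));
```

With this sample data, the merge mode leaves bob with the new value 7, whereas the reduce mode combines the stored 5 with the new 7 to give 12. In both cases alice is untouched and carol is added. That difference is why the reduce mode suits iterative map-reduce: the stored aggregate keeps growing as new data arrives.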
Alas, there's one important limitation to be aware of when thinking about MongoDB's
map-reduce and group : speed. On large data sets, these aggregation functions often
won't perform as quickly as some users may need. This can be blamed almost entirely
on MongoDB's JavaScript engine. It's hard to achieve high performance with a
JavaScript engine that runs single-threaded and interpreted (not compiled).
But despair not. map-reduce and group are widely used and adequate in a lot of
situations. For those cases when they're not, an alternative and a hope for the future
exist. The alternative is to run aggregations elsewhere. Users with especially large data
sets have experienced great success running the data through a Hadoop cluster. The
hope for the future is a newer set of aggregation functions that use compiled,
multithreaded code. These are planned to be released some time after MongoDB v2.0; you
can track progress at https://jira.mongodb.org/browse/SERVER-447.