Aggregating Queries
As previously noted, MongoDB comes with a powerful set of aggregation tools (see Chapter 4 for more information
on these tools), and you can use all of them with the Python driver. You can use the count() function to perform
a count on your data; the distinct() function to get a list of distinct values with no duplicates; and, last but
not least, the map_reduce() function to group your data and batch-manipulate the results or simply to perform counts.
This set of commands, used separately or together, enables you to query effectively for the information you need
to know—and nothing else.
Apart from these basic aggregation commands, the PyMongo driver also supports the aggregation framework.
This powerful feature lets you calculate aggregated values without resorting to the often overly complex
map/reduce (or MapReduce) framework.
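To give a rough idea of what an aggregation framework query looks like from Python, here is a minimal sketch. It assumes a collection whose documents carry the "Status" and "Type" keys used elsewhere in this chapter; the PIPELINE name, the "total" output key, and the count_in_use_by_type() helper are illustrative and not part of PyMongo. The helper needs a live collection, so it is defined here but not called.

```python
# A pipeline is an ordinary Python list of stage documents.
PIPELINE = [
    # Stage 1: keep only the documents whose Status is "In use".
    {"$match": {"Status": "In use"}},
    # Stage 2: group the remaining documents by Type and count each group.
    {"$group": {"_id": "$Type", "total": {"$sum": 1}}},
]


def count_in_use_by_type(collection):
    """Run PIPELINE against a PyMongo collection and return the result rows.

    Hypothetical helper: `collection` must be a connected PyMongo
    Collection object, so this function is not invoked in the sketch.
    """
    return list(collection.aggregate(PIPELINE))
```

Each stage receives the output of the stage before it, which is why the filter comes before the grouping.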
Counting Items with count()
You can use the count() function if all you want is to count the total number of items matching your criteria. The
function doesn't return all the information the way the find() function does; instead, it returns an integer value
with the total number of items found.
Let's look at some simple examples. We can begin by returning the total number of documents in the entire
collection, without specifying any criteria:
>>> collection.count()
3
You can also narrow the count by passing criteria to find() and calling count() on the result, as in this example:
>>> collection.find({"Status" : "In use", "Location.Owner" : "Walker, Jan"}).count()
1
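To make the matching rule concrete, the following plain-Python sketch mimics what such a count does for simple equality criteria, including a dotted key such as "Location.Owner". It is not PyMongo code and needs no server: the sample documents are hypothetical stand-ins for the three items in the collection, and get_path() and count_matching() are illustrative helper names. Array matching and query operators are deliberately left out.

```python
def get_path(doc, dotted_key):
    """Follow a dotted key such as "Location.Owner" into nested dicts."""
    current = doc
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # the key is missing somewhere along the path
        current = current[part]
    return current


def count_matching(docs, criteria=None):
    """Count the documents whose values equal every entry in criteria."""
    criteria = criteria or {}
    return sum(
        all(get_path(doc, key) == value for key, value in criteria.items())
        for doc in docs
    )


# Hypothetical sample data; only the key names come from this chapter.
docs = [
    {"ItemNumber": "A-001", "Status": "In use",
     "Location": {"Owner": "Walker, Jan"}},
    {"ItemNumber": "A-002", "Status": "In use",
     "Location": {"Owner": "Doe, Sam"}},
    {"ItemNumber": "A-003", "Status": "Not used",
     "Location": {"Department": "Storage"}},
]

print(count_matching(docs))  # no criteria: every document counts
print(count_matching(docs, {"Status": "In use",
                            "Location.Owner": "Walker, Jan"}))
```

With no criteria every document is counted; with both criteria only the document that satisfies all of them is counted.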
The count() function can be great when all you need is a quick count of the total number of documents that
match your criteria.
Counting Unique Items with distinct()
The count() function is a great way to get the total number of items returned. However, sometimes you might
accidentally add duplicates to your collection because you simply forgot to remove or change an old document, and
you want an accurate count that ignores those duplicates. This is where the distinct() function can help you out.
This function returns only the unique values found for a given key. Let's set up an example by adding another item
to the collection, but with an ItemNumber used previously:
>>> dup = ( {
...     "ItemNumber" : "2345FDX",
...     "Status" : "Not used",
...     "Type" : "Laptop",
...     "Location" : {
...         "Department" : "Storage",
...         "Building" : "1A"
...     },
...     "Tags" : ["Not used","Laptop","Storage"]
... } )
>>> collection.insert(dup)
ObjectId('4c592eb84abffe0e0c000004')
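To illustrate what distinct() does with a duplicated ItemNumber like this one, here is a plain-Python sketch of the idea. It is not PyMongo code: distinct_values() is an illustrative helper, and the first two documents are hypothetical, with only the dup document taken from the example above. The sketch keeps values in first-seen order, which should not be assumed of the server's result, and it treats each element of an array value such as "Tags" as a separate value.

```python
def distinct_values(docs, key):
    """Return the unique values stored under key, in first-seen order."""
    seen = []
    for doc in docs:
        if key not in doc:
            continue  # documents without the key contribute nothing
        value = doc[key]
        # An array value contributes each of its elements individually.
        candidates = value if isinstance(value, list) else [value]
        for candidate in candidates:
            if candidate not in seen:
                seen.append(candidate)
    return seen


# The dup document from the example above.
dup = {
    "ItemNumber": "2345FDX",
    "Status": "Not used",
    "Type": "Laptop",
    "Location": {"Department": "Storage", "Building": "1A"},
    "Tags": ["Not used", "Laptop", "Storage"],
}

# Hypothetical earlier documents, one sharing dup's ItemNumber.
docs = [
    {"ItemNumber": "2345FDX", "Tags": ["Laptop", "In use"]},
    {"ItemNumber": "0001HYP", "Tags": ["Desktop"]},
    dup,
]

print(len(docs))                            # three documents in total
print(distinct_values(docs, "ItemNumber"))  # the duplicate appears once
print(distinct_values(docs, "Tags"))
```

Comparing the number of documents with the number of distinct ItemNumber values is one quick way to spot that a duplicate has crept in.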
 