This leads to some unfortunate circumstances where our index cannot be used optimally:
▪ Whenever we have a range query on two or more properties, they cannot both be used effectively in the index.
▪ Whenever we have a range query combined with a sort on a different property, the index
is somewhat less efficient than when doing a range and sort on the same property set.
In such cases, the best approach is to test with representative data, making liberal use of explain(). If you discover that the MongoDB query optimizer is making a bad choice of index (perhaps choosing to reduce the number of entries scanned at the expense of doing a large in-memory sort, for instance), you can use the hint() method to tell it which index to use.
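As a minimal sketch of that workflow, the helper below runs the same query twice through PyMongo's cursor API, once letting the optimizer choose and once with hint(), and returns both explain() documents for comparison. The function name compare_plans, the example query, and the index specification are illustrative assumptions, not part of the original text; collection is assumed to be a PyMongo collection, and the hinted index is assumed to exist already.

```python
from datetime import datetime

ASCENDING = 1  # same value as pymongo.ASCENDING


def compare_plans(collection, query, sort_spec, index_spec):
    """Return (default_plan, hinted_plan) for the same query.

    `collection` is assumed to be a pymongo Collection; `index_spec`
    must describe an index that already exists on it.
    """
    # Plan chosen by the query optimizer on its own.
    default_plan = collection.find(query).sort(sort_spec).explain()
    # Plan produced when the optimizer is told which index to use.
    hinted_plan = (collection.find(query)
                   .sort(sort_spec)
                   .hint(index_spec)
                   .explain())
    return default_plan, hinted_plan


# Hypothetical query: a range on 'time' combined with a sort on 'path'.
example_query = {'time': {'$gte': datetime(2000, 10, 1),
                          '$lt': datetime(2000, 11, 1)}}
example_sort = [('path', ASCENDING)]
example_index = [('time', ASCENDING), ('path', ASCENDING)]
```

With a live connection, compare_plans(db.events, example_query, example_sort, example_index) would return two documents whose scanned-entry counts and sort stages can be compared. The exact field names in the explain() output vary by server version.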
Counting requests by day and page
Finding requests is all well and good, but more frequently we need to count requests, or perform some other aggregate operation on them during analysis. Here, we'll describe how you can use MongoDB's aggregation framework, introduced in version 2.1, to select, process, and aggregate results from a large number of documents for powerful ad hoc queries. In this case, we'll count the number of requests per resource (i.e., page) per day in the last month.
To use the aggregation framework, we need to set up a pipeline of operations. In this case, our
pipeline looks like Figure 4-1 and is implemented by the database command shown here:
>>> from datetime import datetime
>>> result = db.command('aggregate', 'events', pipeline=[
...     # Keep only the events from the month of interest
...     {'$match': {
...         'time': {
...             '$gte': datetime(2000, 10, 1),
...             '$lt': datetime(2000, 11, 1)}}},
...     # Extract the path and the date components of each event
...     {'$project': {
...         'path': 1,
...         'date': {
...             'y': {'$year': '$time'},
...             'm': {'$month': '$time'},
...             'd': {'$dayOfMonth': '$time'}}}},
...     # Count the events for each unique (path, year, month, day)
...     {'$group': {
...         '_id': {
...             'p': '$path',
...             'y': '$date.y',
...             'm': '$date.m',
...             'd': '$date.d'},
...         'hits': {'$sum': 1}}},
... ])
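To make the effect of the three stages concrete, here is a rough pure-Python sketch of what the pipeline computes, run against a few in-memory documents instead of the database. The function name count_hits and the sample documents are made up for illustration; this is not how MongoDB executes the pipeline, only an equivalent calculation under the assumption that each event has path and time fields.

```python
from collections import Counter
from datetime import datetime


def count_hits(events, start, end):
    """Count events per (path, year, month, day) with start <= time < end."""
    hits = Counter()
    for event in events:
        t = event['time']
        # $match: keep only events inside the half-open time window.
        if not (start <= t < end):
            continue
        # $project + $group: key on path and calendar day, add 1 per event.
        hits[(event['path'], t.year, t.month, t.day)] += 1
    return hits


# Made-up sample documents, for illustration only.
sample = [
    {'path': '/index.html', 'time': datetime(2000, 10, 10, 8, 0)},
    {'path': '/index.html', 'time': datetime(2000, 10, 10, 9, 30)},
    {'path': '/about.html', 'time': datetime(2000, 10, 11, 12, 0)},
    {'path': '/index.html', 'time': datetime(2000, 11, 1, 0, 0)},  # outside window
]
result = count_hits(sample, datetime(2000, 10, 1), datetime(2000, 11, 1))
```

Each key of the returned Counter plays the role of the _id document in the $group stage, and each value corresponds to hits.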
 
 
 