Aside: managing index size
One thing to keep in mind when creating indexes is how much RAM they occupy. When an
index is accessed randomly, as is the case here with our index on path, the entire index
needs to be resident in RAM. In this particular case, the total number of distinct paths is
typically small relative to the number of documents, which limits the space that the index
requires.
To actually see the size of an index, you can use the collstats database command:
>>> db.command('collstats', 'events')['indexSizes']
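As a rough sketch of how that result might be read, the helper below takes a dictionary shaped like the indexSizes output (index name mapped to a size, assumed here to be in bytes, which is what collstats reports when no scale is given) and converts it to megabytes. The name summarize_index_sizes and the sample figures are hypothetical, chosen only for illustration; they do not come from a real collection.

```python
def summarize_index_sizes(index_sizes):
    """Return (per-index sizes in MB, total in MB) for an indexSizes-style dict.

    `index_sizes` is assumed to map index names to sizes in bytes.
    """
    mb = 1024 * 1024
    per_index = {name: size / mb for name, size in index_sizes.items()}
    total = sum(per_index.values())
    return per_index, total


# Made-up sizes for illustration only; real values would come from collstats.
sample = {'_id_': 8 * 1024 * 1024, 'path_1': 2 * 1024 * 1024}
per_index, total_mb = summarize_index_sizes(sample)
for name, size_mb in sorted(per_index.items()):
    print('%-10s %6.1f MB' % (name, size_mb))
print('%-10s %6.1f MB' % ('total', total_mb))
```

Comparing the total against the RAM available on the server gives a quick sense of whether a randomly accessed index can stay resident.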
There is another kind of index that doesn't take up much RAM: a right-aligned index.
Right-aligned refers to the access pattern of a regular index, not a special MongoDB index
type. In this case, most of the queries that use the index focus on the largest (or smallest)
values in the index, so most of the index is never actually used. This is often the case with
time-oriented data, where you tend to query documents from the recent past. Here, only a
very thin “sliver” of the index is ever resident in RAM at a particular time, so index size is
of much less concern.
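To make the "sliver" idea concrete, here is a toy model, not a description of MongoDB's internal index structure: a sorted Python list stands in for the ordered index keys, and a range scan for recent values only walks the tail of that list. The function touched_fraction and the synthetic data (one event per minute for 100 days) are assumptions made for this illustration.

```python
from bisect import bisect_left


def touched_fraction(sorted_keys, lower_bound):
    """Fraction of entries in `sorted_keys` that are >= `lower_bound`.

    `sorted_keys` stands in for an index's ordered keys; a scan for keys at
    or above `lower_bound` only visits entries from that position onward.
    """
    if not sorted_keys:
        return 0.0
    start = bisect_left(sorted_keys, lower_bound)
    return (len(sorted_keys) - start) / len(sorted_keys)


# Synthetic timestamps: one event per minute for 100 days, as integer minutes.
keys = list(range(100 * 24 * 60))
last_day_start = keys[-1] - 24 * 60 + 1

# Querying only the most recent day touches about 1% of the entries.
fraction = touched_fraction(keys, last_day_start)
print(fraction)
```

In this model, the older 99% of the keys are never visited by queries for the last day, which is the access pattern the term right-aligned describes.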
Finding all the events for a particular date
Another operation we might wish to perform is to query the event log for all events that happened
on a particular date, perhaps as part of a security audit of suspicious activity. In this case, we'll
use a range query:
>>> from datetime import datetime
>>> q_events = db.events.find({'time':
...     {'$gte': datetime(2000, 10, 10), '$lt': datetime(2000, 10, 11)}})
This query selects documents from the events collection where the value of the time field
represents a date that is on or after (i.e., $gte) 2000-10-10 but before (i.e., $lt) 2000-10-11.
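Since the same half-open range is needed for any date, one option is to build the filter with a small helper. The sketch below is only an illustration: the name day_range_query and its field parameter are hypothetical, and it assumes event times are stored as naive datetime values, as in the query above.

```python
from datetime import datetime, timedelta


def day_range_query(year, month, day, field='time'):
    """Build a filter matching `field` values that fall on the given date.

    The range is half-open: it includes midnight at the start of the day and
    excludes midnight at the start of the next day.
    """
    start = datetime(year, month, day)
    end = start + timedelta(days=1)
    return {field: {'$gte': start, '$lt': end}}


query = day_range_query(2000, 10, 10)
# With a PyMongo collection, this could be passed as db.events.find(query).
print(query)
```

Computing the upper bound with timedelta avoids hand-writing the next day's date, which matters at month and year boundaries.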
Here, an index on the time field would optimize performance:
>>> db.events.create_index('time')
Note that this is a right-aligned index so long as our queries tend to focus on the recent history.