REGULAR EXPRESSION PITFALLS
If you use the re.IGNORECASE flag, you're basically back where you were, since the
indexes are created as case-sensitive. If you want case-insensitive search, it's typically a good
idea to store the data you want to search on in all-lowercase or all-uppercase format.
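As a minimal sketch of that advice, the snippet below stores a lowercase copy of the title next to the original and builds an anchored, case-sensitive prefix query against the copy. The field name title_lower and both helper functions are assumptions made for illustration; they are not part of MongoDB or PyMongo.

```python
import re

def add_search_field(product):
    """Return a copy of the product with a lowercase copy of its title."""
    doc = dict(product)
    # 'title_lower' is a hypothetical field name chosen for this sketch.
    doc['title_lower'] = doc['title'].lower()
    return doc

def prefix_query(prefix):
    """Build a case-sensitive prefix query against the lowercase field."""
    # Lowercase the user's input instead of passing re.IGNORECASE, so the
    # pattern stays a plain anchored prefix that an index can serve.
    pattern = re.compile('^' + re.escape(prefix.lower()))
    return {'title_lower': pattern}

doc = add_search_field({'type': 'Film', 'title': 'Hackers'})
query = prefix_query('HACK')
```

The original title stays available for display, while the resulting dict could be passed to db.products.find(), assuming an index exists on the lowercase field.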
If for some reason you don't want to use a compiled regular expression, MongoDB provides
a special syntax for regular expression queries using plain Python dict objects:
query = db.products.find({
    'type': 'Film',
    'title': {'$regex': '.*hacker.*', '$options': 'i'}})
query = query.sort([('details.issue_date', -1)])
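When the search keyword comes from user input, the same dict-style query can be built by a small helper. The sketch below is one possible way to do that; the function name title_contains_query is hypothetical, and escaping the keyword with Python's re.escape is an assumption about how you might want to treat regex metacharacters, not something MongoDB requires.

```python
import re

def title_contains_query(product_type, keyword):
    """Build a case-insensitive 'title contains keyword' query document."""
    # Escape the keyword so characters such as '.' match literally.
    escaped = re.escape(keyword)
    return {
        'type': product_type,
        'title': {'$regex': '.*' + escaped + '.*', '$options': 'i'},
    }

query_doc = title_contains_query('Film', 'hacker')
```

The returned dict has the same shape as the literal query shown above, so it could be handed to db.products.find() and sorted in the same way.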
The indexing strategy for these kinds of queries is different from previous attempts. Here,
create an index on { type: 1, details.issue_date: -1, title: 1 } using the following
Python console session:
>>> # ensure_index() was removed in PyMongo 4; create_index() is the current call
>>> db.products.create_index([
...     ('type', 1),
...     ('details.issue_date', -1),
...     ('title', 1)])
This index lets MongoDB evaluate the regular expression against the title values stored in
the index rather than fetching and scanning whole documents for the title field. Additionally,
placing the details.issue_date field before the title field supports the sort: the index entries
are already ordered by issue date when MongoDB filters on title, so the result set needs no
separate sort step.
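To see why the field order matters, the following pure-Python sketch imitates a scan over such a compound index. It is only a simplified model, not how MongoDB is implemented: the sample products, the use of plain years as issue dates, and the scan function are all assumptions made for illustration.

```python
import re

# Hypothetical sample data; years stand in for real issue dates.
products = [
    {'type': 'Film', 'title': 'Hackers', 'details': {'issue_date': 1995}},
    {'type': 'Film', 'title': 'Sneakers', 'details': {'issue_date': 1992}},
    {'type': 'Audio Album', 'title': 'Hacker Songs', 'details': {'issue_date': 2001}},
    {'type': 'Film', 'title': 'The Hacker Wars', 'details': {'issue_date': 2014}},
]

# Model the index as sorted tuples: type ascending, issue_date descending
# (negated so a plain sort works), title ascending, then document position.
index = sorted(
    (p['type'], -p['details']['issue_date'], p['title'], i)
    for i, p in enumerate(products)
)

def scan(index, product_type, pattern):
    """Walk the index in order, filtering on title without loading documents."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for type_, _neg_date, title, position in index:
        if type_ != product_type:
            continue  # a real B-tree would seek straight to this type
        if regex.search(title):  # title is read from the index entry itself
            hits.append(position)
    return hits

matches = [products[i]['title'] for i in scan(index, 'Film', 'hacker')]
```

Because the entries for each type are already in descending issue-date order, the matching titles come out newest first without any sorting after the filter.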
Conclusion: Index all the things!
In ecommerce systems, we typically don't know exactly what the user will be filtering on, so
it's a good idea to create a number of indexes on queries that are likely to happen. Although
such indexing will slow down updates, a product catalog is only very infrequently updated,
so this drawback is justified by the significant improvements in search speed. To sum up, if
your application has a code path to execute a query, there should be an index to accelerate that
query.