Databases Reference
In-Depth Information
{ "_id" : ..., "username" : "joe", "age" : 27, "user_id" : 7 }
{ "_id" : ..., "username" : "john", "age" : 36, "user_id" : 2 }
{ "_id" : ..., "username" : "john", "age" : 18, "user_id" : 3 }
{ "_id" : ..., "username" : "john", "age" : 7, "user_id" : 5 }
{ "_id" : ..., "username" : "sally", "age" : 52, "user_id" : 9 }
{ "_id" : ..., "username" : "simon", "age" : 59, "user_id" : 10 }
{ "_id" : ..., "username" : "simon", "age" : 3, "user_id" : 6 }
{ "_id" : ..., "username" : "smith", "age" : 48, "user_id" : 0 }
{ "_id" : ..., "username" : "smith", "age" : 30, "user_id" : 1 }
The usernames are in strictly increasing alphabetical order, and within each name group
the ages are in decreasing order. This optimizes sorting by {"username" : 1, "age" :
-1} but is less efficient at sorting by {"username" : 1, "age" : 1} . If we wanted to
optimize {"username" : 1, "age" : 1} , we would create an index on {"username" : 1,
"age" : 1} to organize ages in ascending order.
The index on username and age also makes queries on username fast. In general, if an
index has N keys, it will make queries on any prefix of those keys fast. For instance, if
we have an index that looks like {"a" : 1, "b" : 1, "c" : 1, ..., "z" : 1} , we
effectively have an index on {"a" : 1} , {"a" : 1, "b" : 1} , {"a" : 1, "b" : 1, "c" :
1} , and so on. Queries that would use the index {"b" : 1} , {"a" : 1, "c" :1} , and so
on will not be optimized: only queries that can use a prefix of the index can take ad-
vantage of it.
The MongoDB query optimizer will reorder query terms to take advantage of indexes:
if you query for {"x" : "foo", "y" : "bar"} and you have an index on {"y" : 1, "x" :
1} , MongoDB will figure it out.
The disadvantage to creating an index is that it puts a little bit of overhead on every
insert, update, and remove. This is because the database not only needs to do the
operation but also needs to make a note of it in any indexes on the collection. Thus,
the absolute minimum number of indexes should be created. There is a built-in max-
imum of 64 indexes per collection, which is more than almost any application should
need.
Do not index every key. This will make inserts slow, take up lots of
space, and probably not speed up your queries very much. Figure out
what queries you are running, what the best indexes are for these quer-
ies, and make sure that the server is using the indexes you've created
using the explain and hint tools described in the next section.
Sometimes the most efficient solution is actually not to use an index. In general, if a
query is returning a half or more of the collection, it will be more efficient for the
database to just do a table scan instead of having to look up the index and then the
value for almost every single document. Thus, for queries such as checking whether a
key exists or determining whether a boolean value is true or false, it may actually be
better to not use an index at all.
 
Search WWH ::




Custom Search