Scaling Indexes
Suppose we have a collection of status messages from users. We want to query by user
and date to pull up all of a user's recent statuses. Using what we've learned so far, we
might create an index that looks like the following:
> db.status.ensureIndex({user : 1, date : -1})
This will make the query for user and date efficient, but it is not actually the best index
choice.
Imagine this as a book index again. We would have a list of documents sorted by user
and then subsorted by date, so it would look something like the following:
User 123 on March 13, 2010
User 123 on March 12, 2010
User 123 on March 11, 2010
User 123 on March 5, 2010
User 123 on March 4, 2010
User 124 on March 12, 2010
User 124 on March 11, 2010
...
This looks fine at this scale, but imagine that the application has millions of users,
each posting dozens of status updates per day. If the index entries for each user's status
messages take up a page's worth of space on disk, then every “latest statuses” query for
a different user forces the database to load a different page into memory. This will be
very slow once the site becomes popular enough that not all of the index fits into memory.
If we flip the index order to {date : -1, user : 1}, the database can keep the last
couple of days of the index in memory, swap less, and thus query for the latest statuses
for any user much more quickly.
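The effect of the key order can be sketched in plain JavaScript, without a database. The sketch below is only an illustration under stated assumptions: the page size of four entries, the six users, the eight days, and the helper names (buildEntries, sortLikeIndex, pagesTouched) are all made up for this example, and a real index is a B-tree rather than a flat sorted array. It sorts the same entries in both key orders and counts how many pages hold entries from the last two days.

```javascript
// Plain JavaScript sketch (no database needed); all names are illustrative.
const PAGE_SIZE = 4; // assumed number of index entries that fit on one page

// Build one entry per user per day.
function buildEntries(userCount, dayCount) {
  const entries = [];
  for (let user = 1; user <= userCount; user++) {
    for (let day = 1; day <= dayCount; day++) {
      entries.push({ user: user, day: day });
    }
  }
  return entries;
}

// Order entries the way a compound index with the given keys would.
// keys is an array of [field, direction] pairs, e.g. [["user", 1], ["day", -1]].
function sortLikeIndex(entries, keys) {
  return entries.slice().sort(function (a, b) {
    for (const [field, direction] of keys) {
      if (a[field] !== b[field]) {
        return (a[field] - b[field]) * direction;
      }
    }
    return 0;
  });
}

// Collect the page numbers that hold at least one "recent" entry.
function pagesTouched(sorted, isRecent) {
  const pages = new Set();
  sorted.forEach(function (entry, position) {
    if (isRecent(entry)) {
      pages.add(Math.floor(position / PAGE_SIZE));
    }
  });
  return pages;
}

const entries = buildEntries(6, 8); // 6 users, 8 days of statuses
const isRecent = function (entry) { return entry.day >= 7; }; // last two days

const byUserThenDate = sortLikeIndex(entries, [["user", 1], ["day", -1]]);
const byDateThenUser = sortLikeIndex(entries, [["day", -1], ["user", 1]]);

const userFirstPages = pagesTouched(byUserThenDate, isRecent).size;
const dateFirstPages = pagesTouched(byDateThenUser, isRecent).size;

console.log("user-first pages holding recent entries:", userFirstPages);
console.log("date-first pages holding recent entries:", dateFirstPages);
```

With these toy numbers, the recent entries are spread over six pages in the user-first order (one per user) but packed into the first three pages in the date-first order. The gap widens as the number of users grows, because the user-first order always needs at least one page per active user.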
Thus, there are several questions to keep in mind when deciding what indexes to create:
1. What queries are you running? Some of the keys they use will need to be indexed.
2. What is the correct direction for each key?
3. How is this going to scale? Is there a different ordering of keys that would keep
more of the frequently used portions of the index in memory?
If you can answer these questions, you are ready to index your data.
Indexing Keys in Embedded Documents
Indexes can be created on keys in embedded documents in the same way that they are
created on normal keys. For example, if we want to be able to search blog post
comments by date, we can create an index on the "date" key in the array of embedded
"comments" documents:
> db.blog.ensureIndex({"comments.date" : 1})
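To make the dotted key concrete, here is a rough plain-JavaScript sketch of how a path such as "comments.date" reaches into an array of embedded documents and yields one value per comment. The sample post and the valuesAtPath helper are hypothetical, invented for this illustration; this is not how the database implements indexing internally, only a picture of which values the key refers to.

```javascript
// Hypothetical blog post; field values are made up for illustration.
const post = {
  title: "A post",
  comments: [
    { author: "joe", date: new Date("2010-03-11") },
    { author: "ann", date: new Date("2010-03-13") }
  ]
};

// Resolve a dotted path against a document, fanning out over any arrays
// met along the way so that each embedded document contributes its value.
function valuesAtPath(doc, path) {
  let current = [doc];
  for (const part of path.split(".")) {
    const next = [];
    for (const value of current) {
      const items = Array.isArray(value) ? value : [value];
      for (const item of items) {
        if (item !== null && typeof item === "object" && part in item) {
          next.push(item[part]);
        }
      }
    }
    current = next;
  }
  return current;
}

// One value per embedded comment.
const commentDates = valuesAtPath(post, "comments.date");
console.log(commentDates.length);
```

Queries use the same dotted notation, so a search on "comments.date" refers to exactly these per-comment values.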
 