Use SSDs
SSDs (solid state drives) are much faster than spinning hard disks for many workloads,
but they are often smaller and more expensive, they are difficult to erase securely, and
they still do not come close to the speed at which you can read from memory. This isn't
to discourage you from using them: they usually work fantastically with MongoDB,
but they aren't a magical cure-all.
Add more RAM
Adding more RAM means you have to hit disk less. However, adding RAM will
only get you so far—at some point, your data isn't going to fit in RAM anymore.
So, the question becomes: how do we store terabytes (petabytes?) of data on disk, but
program an application that will mostly access data already in memory and move data
from disk to memory as infrequently as possible?
If you literally access all of your data randomly in real time, you're just going to need
a lot of RAM. However, most applications don't: recent data is accessed more than
older data, certain users are more active than others, and certain regions have more
customers than others. Applications like these can be designed to keep certain documents
in memory and go to disk very infrequently.
Tip #22: Use indexes to do more with less memory
First, just so we're all on the same page, Figure 3-1 shows the sequence a read request
takes.
We'll assume, for this topic, that a page of memory is 4KB, although this is not
universally true.
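If you're curious what the page size actually is on your machine, Unix-like systems expose it through `sysconf`; a quick check from Python (this is an aside of mine, not from the original text):

```python
import os

# Ask the OS for its virtual-memory page size in bytes.
# 4096 (4KB) is the common value on x86 Linux, but other
# platforms (e.g. some ARM systems) use larger pages.
page_size = os.sysconf("SC_PAGE_SIZE")
print(page_size)
```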
So, let's say you have a machine with 256GB of data and 16GB of memory. Let's say
most of this data is in one collection and you query this collection. What does
MongoDB do?
MongoDB loads the first page of documents from disk into memory, and compares
those to your query. Then it loads the next page and compares those. Then it loads the
next page. And so on, through 256GB of data. It can't take any shortcuts: it cannot
know if a document matches without looking at the document, so it must look at every
document. Thus, it will need to load all 256GB into memory (the OS takes care of
swapping the oldest pages out of memory as it needs room for new ones). This is going
to take a long, long time.
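To make the cost concrete, here is a toy Python sketch of an unindexed query (my own illustration, not MongoDB's actual code): with no index, every single document must be examined, no matter how few of them match.

```python
# A stand-in "collection" of 10,000 documents with a field x.
collection = [{"_id": i, "x": i % 100} for i in range(10_000)]

def unindexed_find(docs, field, value):
    """Scan every document; also count how many we had to examine."""
    examined = 0
    matches = []
    for doc in docs:
        examined += 1  # a match can't be ruled out without looking
        if doc.get(field) == value:
            matches.append(doc)
    return matches, examined

matches, examined = unindexed_find(collection, "x", 42)
print(examined)      # 10000 -- every document was touched
print(len(matches))  # 100   -- even though only 1% matched
```

The ratio is the point: 10,000 documents examined to return 100. On a real 256GB collection, "examined" means "paged in from disk."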
How can we avoid loading all 256GB into memory every time we do a query? We can
tell MongoDB to create an index on a given field, x, and MongoDB will create a tree of
the collection's values for that field. MongoDB basically preprocesses the data, adding
every x value in the collection to an ordered tree (see Figure 3-2). Each index entry in
the tree contains a value of x and a pointer to the document with that x value. The tree
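A rough Python sketch of the idea, using a sorted list of (value, pointer) pairs as a stand-in for MongoDB's on-disk tree (a simplification of mine, not the actual B-tree implementation):

```python
import bisect

# Same stand-in collection as before: 10,000 documents with a field x.
collection = [{"_id": i, "x": i % 100} for i in range(10_000)]

# "Create the index": every x value, ordered, each paired with a
# pointer back to its document (here, the document's list offset).
index = sorted((doc["x"], pos) for pos, doc in enumerate(collection))
keys = [value for value, _pos in index]

def indexed_find(value):
    """Binary-search the index, then follow pointers to the documents."""
    lo = bisect.bisect_left(keys, value)   # first entry with this value
    hi = bisect.bisect_right(keys, value)  # one past the last entry
    return [collection[pos] for _value, pos in index[lo:hi]]

docs = indexed_find(42)
print(len(docs))  # 100 -- found without examining non-matching documents
```

Because the entries are ordered, finding the matching range takes a handful of comparisons instead of a pass over the whole collection, and only the matching documents are ever followed back to.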
 