For writes, MongoDB will use only one core at a time. This is because of the global
write lock. Thus the only way to scale write load is to ensure that writes aren't
I/O-bound, and from there scale horizontally with sharding. This is mitigated somewhat in
MongoDB v2.0 because generally writes won't hold the lock across a page fault but
will instead allow another operation to complete. Still, a number of concurrency
optimizations are in the works. Among the possible options to be implemented are
collection-level locking and extent-based locking. Consult JIRA and the latest release
notes for the status of these improvements.
RAM
As with any database, MongoDB performs best with lots of RAM. Be sure to select
hardware (virtual or otherwise) with enough RAM to contain your frequently used indexes
plus your working data set. Then as your data grows, keep a close eye on the ratio of
RAM to working set size. If you allow working set size to grow beyond RAM, you may
start to see significant performance degradation. Paging from disk in and of itself isn't
a problem, as it's a necessary step in loading data into memory. But if you're unhappy
with performance, excessive paging may be your problem. Chapter 7 discusses the
relationship between working set, index size, and RAM in great detail. At the end of
this chapter, you'll read about ways of identifying RAM deficiencies.
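As a rough illustration of watching that ratio, the following Python sketch
estimates working set size from the dataSize and indexSize fields that the dbStats
command reports and compares it to RAM. The function name, the hot_fraction
parameter (the share of data assumed to be in frequent use), and the sample numbers
are all hypothetical; real working set size depends on your access patterns and can
only be confirmed by testing.

```python
GIB = 1024 ** 3


def working_set_ratio(stats, ram_bytes, hot_fraction=1.0):
    """Return estimated working set size divided by available RAM.

    stats        -- dict with 'dataSize' and 'indexSize' in bytes, as
                    reported by the dbStats command
    ram_bytes    -- physical RAM available to the server, in bytes
    hot_fraction -- ASSUMPTION: share of the data that is frequently
                    accessed (1.0 treats all data as hot)
    """
    # Indexes are counted in full; data is scaled by the assumed hot share.
    working_set = stats["indexSize"] + stats["dataSize"] * hot_fraction
    return working_set / ram_bytes


# Hypothetical numbers: 8 GiB of data, 2 GiB of indexes, 16 GiB of RAM,
# with half of the data assumed to be in frequent use.
sample_stats = {"dataSize": 8 * GIB, "indexSize": 2 * GIB}
ratio = working_set_ratio(sample_stats, ram_bytes=16 * GIB, hot_fraction=0.5)
print("working set / RAM = %.3f" % ratio)
```

A ratio approaching or exceeding 1.0 would suggest that the working set no longer
fits in memory. With a live server, the stats dictionary could come from a driver
call such as pymongo's db.command("dbstats").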
There are a few use cases where you can safely let data size grow well beyond
available RAM, but they're the exception, not the rule. One example is using MongoDB as
an archive, where reads and writes seldom happen and where you don't need fast
responses. In this case, having as much RAM as data might be prohibitively expensive
with little benefit, since the application won't ever utilize so much RAM. For all data
sets, the key is testing. Test a representative prototype of your application to ensure
that you get the necessary baseline performance.
DISKS
When choosing disks, you need to consider IOPS (input/output operations per
second) and seek time. The differences between running on a single consumer-grade
hard drive, running in the cloud in a virtual disk (say, EBS), and running against a
high-performance SAN can't be overemphasized. Some applications will perform
acceptably against a single network-attached EBS volume, but demanding applications
will require something more.
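To make the link between seek time and IOPS concrete, here is a back-of-the-envelope
Python sketch. It assumes a simplified model in which every random read costs one
full seek and fetches one 4 KB page; rotational latency, caching, and read-ahead are
ignored, so the figures are illustrative only. The function names and sample values
are hypothetical.

```python
def max_random_iops(seek_time_ms):
    """Upper bound on random IOPS if each operation costs one full seek."""
    return 1000.0 / seek_time_ms


def seconds_to_page_in(total_bytes, iops, page_bytes=4096):
    """Time to read total_bytes with one page per I/O operation.

    ASSUMPTION: purely random access, one page per operation.
    """
    pages = -(-total_bytes // page_bytes)  # ceiling division
    return pages / iops


# Hypothetical disk with a 10 ms average seek time.
iops = max_random_iops(10)
# Paging in 4 GiB of data one 4 KB page at a time.
seconds = seconds_to_page_in(4 * 1024 ** 3, iops)
print("%.0f IOPS, %.1f hours to page in 4 GiB" % (iops, seconds / 3600))
```

Under these assumptions a single spinning disk needs hours to page in a few
gigabytes of randomly accessed data, which is why storage with higher IOPS can
matter so much for a demanding application.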
Disk performance is important for a few reasons. The first is that, as you're writing
to MongoDB, the server by default will force a sync to disk every 60 seconds. This is
known as a background flush. With a write-intensive app and a slow disk, the
background flushing may be slow enough to negatively affect overall system performance.
Second, a fast disk will allow you to warm up the server much more quickly. Any time
you need to restart a server, you also have to load your data set into RAM. This
happens lazily; each successive read or write to MongoDB will load a new virtual memory
page into RAM until the physical memory is full. A fast disk will make this process
much faster, which will increase MongoDB's performance following a cold restart.
Finally, a fast disk can alter the required ratio of working set size to RAM for your