A database for the modern web - MongoDB in Action


It's possible to run the server without journaling as a way of increasing performance for some write loads. The downside is that the data files may be corrupted after an unclean shutdown. As a consequence, anyone planning to disable journaling must run with replication, preferably to a second data center, to increase the likelihood that a pristine copy of the data will still exist even if there's a failure.

The topics of replication and durability are vast; you'll see a detailed exploration of them in chapter 8.

1.2.6 Scaling

The easiest way to scale most databases is to upgrade the hardware. If your application is running on a single node, it's usually possible to add some combination of disk IOPS, memory, and CPU to ease any database bottlenecks. The technique of augmenting a single node's hardware for scale is known as vertical scaling or scaling up. Vertical scaling has the advantages of being simple, reliable, and cost-effective up to a certain point. If you're running on virtualized hardware (such as Amazon's EC2), then you may find that a sufficiently large instance isn't available. If you're running on physical hardware, there may come a point where the cost of a more powerful server becomes prohibitive.

It then makes sense to consider scaling horizontally, or scaling out. Instead of beefing up a single node, scaling horizontally means distributing the database across multiple machines. Because a horizontally scaled architecture can use commodity hardware, the costs for hosting the total data set can be significantly reduced. What's more, the distribution of data across machines mitigates the consequences of failure.

Machines will unavoidably fail from time to time. If you've scaled vertically, and the machine fails, then you need to deal with the failure of a machine upon which most of your system depends. This may not be an issue if a copy of the data exists on a replicated slave, but it's still the case that only a single server need fail to bring down the entire system. Contrast that with failure inside a horizontally scaled architecture. Such a failure may be less catastrophic, since a single machine represents a much smaller percentage of the system as a whole.
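
To put rough numbers on that contrast, here is a minimal sketch. It assumes the data set is spread evenly across machines and that no replicas exist, which is a simplification; the function name fraction_affected is hypothetical and chosen only for this illustration.

```python
def fraction_affected(machines: int, failed: int = 1) -> float:
    """Fraction of an evenly distributed data set that sits on failed machines.

    Assumes an even spread and no replication (both are simplifications).
    """
    if machines < 1 or not 0 <= failed <= machines:
        raise ValueError("need machines >= 1 and 0 <= failed <= machines")
    return failed / machines


# A vertically scaled system is the one-machine case: one failure affects everything.
single_node = fraction_affected(1)

# With ten machines, one failure touches a tenth of the data.
ten_nodes = fraction_affected(10)
```

Under these assumptions, a single failure affects all of the data on one node but only a tenth of it on ten nodes. Replication, discussed next, reduces the impact further.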

MongoDB has been designed to make horizontal scaling manageable. It does so via a range-based partitioning mechanism, known as auto-sharding, which automatically manages the distribution of data across nodes. The sharding system handles the addition of shard nodes, and it also facilitates automatic failover. Individual shards are made up of a replica set consisting of at least two nodes,[4] ensuring automatic recovery with no single point of failure. All this means that no application code has to handle these logistics; your application code communicates with a sharded cluster just as it speaks to a single node.
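
The idea behind range-based partitioning can be sketched in a few lines. This is an illustration of the general technique only, not MongoDB's implementation: the boundary values, the shard names, and the function shard_for are all assumptions made up for this example.

```python
from bisect import bisect_right

# Hypothetical lower bounds of each key range, in sorted order.
# Range i covers keys from LOWER_BOUNDS[i] up to (not including) LOWER_BOUNDS[i + 1];
# the last range is open-ended.
LOWER_BOUNDS = ["", "g", "n", "t"]

# Hypothetical shard that owns each range, in the same order.
RANGE_OWNERS = ["shard-a", "shard-b", "shard-c", "shard-d"]


def shard_for(key: str) -> str:
    """Return the shard whose key range contains the given key."""
    # bisect_right counts the lower bounds <= key; subtracting one gives
    # the index of the range the key falls into.
    index = bisect_right(LOWER_BOUNDS, key) - 1
    return RANGE_OWNERS[index]


# Route a few example keys to their shards.
routes = {name: shard_for(name) for name in ["alice", "gina", "zed"]}
```

In a real sharded cluster this lookup happens on the database side, which is why the application can treat the cluster like a single node.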

We've covered a lot of MongoDB's most compelling features; in chapter 2, we'll begin to see how some of these work in practice. But at this point, we're going to take

[4] Technically, each replica set will have at least three nodes, but only two of these need carry a copy of the data.
