Data center failures
The most likely data center failure is a power outage. When running your
MongoDB servers without journaling enabled, a power outage will mean unclean
shutdowns of the MongoDB servers, potentially corrupting the data files. The only
reliable recovery from such a failure is a database repair, a lengthy process that
guarantees downtime.
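For reference, a repair can be run offline from the command line against the
server's data directory. This is a minimal sketch; the dbpath shown is
illustrative only:

    mongod --dbpath /data/db --repair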
Most users will run their entire cluster within a single data center, and this is
fine for plenty of applications. The main precaution to take in this case is to ensure
that at least one node from each shard, and one config server, is running with
journaling enabled. This will greatly expedite recovery once power is restored.
Journaling is covered in chapter 10.
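As an illustration, journaling is enabled with the --journal startup option.
Here's a minimal sketch of starting one shard member and one config server with
journaling on; the replica set name and dbpaths are hypothetical:

    mongod --shardsvr --replSet shard-a --dbpath /data/shard-a --journal
    mongod --configsvr --dbpath /data/config --journal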
Still, some failures are more severe. Power outages can sometimes last for days.
Floods, earthquakes, and other natural disasters can physically destroy a data
center. Users who want to be able to recover quickly from these kinds of failures
must deploy their shard clusters across multiple data centers.
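One common approach, sketched here with hypothetical hostnames, is to place a
priority-0 member of each shard's replica set in a second data center, so that a
copy of the data survives the loss of the primary facility. From the mongo shell:

    rs.initiate({
      _id: "shard-a",
      members: [
        { _id: 0, host: "dc1-node1.example.com:27017" },
        { _id: 1, host: "dc1-node2.example.com:27017" },
        // priority 0 keeps the remote member from ever becoming primary
        { _id: 2, host: "dc2-node1.example.com:27017", priority: 0 }
      ]
    })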
The decision about which sharding topology is best for your application should always
be based on serious consideration of how much downtime you can tolerate, as
measured by your mean time to recovery (MTTR). Think about the potential failure
scenarios and simulate them. Consider the consequences for your application (and
business) if a data center should fail.
CONFIGURATION NOTES
The following are a few useful notes on configuring a sharded cluster.
Estimating cluster size
Users frequently want to know how many shards to deploy and how large each shard
should be. The answer, of course, depends on the circumstances. If you're deploying
on Amazon's EC2, you shouldn't shard until you've maxed out the largest available
instances. At the time of this writing, the largest EC2 nodes have 68 GB of RAM. If
you're running on your own hardware, you can easily go larger. It wouldn't be
unreasonable to wait until you've reached 100 GB of data before sharding.
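As a quick way to see how close you are to such a threshold, the mongo shell's
db.stats() helper reports a database's total data size in bytes. A minimal
sketch:

    var stats = db.stats()
    print("data size in GB: " + stats.dataSize / (1024 * 1024 * 1024))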
Naturally, each additional shard introduces extra complexity, and each shard also
requires replicas. Thus it's better to have a small number of large shards than a large
number of small ones.
Sharding an existing collection
You can shard existing collections, but don't be surprised if it takes some time to
distribute the data across shards. Only one balancing round can happen at a time, and
the migrations will move only around 100-200 MB of data per minute. Thus, sharding
a 50 GB collection will take around eight hours, and this will likely involve some
moderate disk activity. In addition, when you initially shard a large collection like
this, you may have to split manually to expedite the sharding process, since splitting
is triggered by inserts.
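To illustrate, here's a minimal sketch of sharding an existing collection and then
pre-splitting it by hand using the mongo shell's sh helpers; the database name,
collection name, and shard key values are all hypothetical:

    sh.enableSharding("app")
    sh.shardCollection("app.events", { userId: 1 })

    // Pre-split manually so migrations can begin immediately,
    // rather than waiting for inserts to trigger splits.
    sh.splitAt("app.events", { userId: "user2500" })
    sh.splitAt("app.events", { userId: "user5000" })
    sh.splitAt("app.events", { userId: "user7500" })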