Data center failures
The most likely data center failure is a power outage. When running your
MongoDB servers without journaling enabled, a power outage will mean unclean
shutdowns of the MongoDB servers, potentially corrupting the data files. The only
reliable recovery from such a failure is a database repair, a lengthy process that
guarantees downtime.
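For reference, a repair can be run offline from the command line against the
server's data directory. This is a minimal sketch; the dbpath shown is
illustrative only:

    mongod --dbpath /data/db --repair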
Most users will run their entire cluster within a single data center, and this is
fine for plenty of applications. The main precaution to take in this case is to ensure
that at least one node from each shard, and one config server, is running with
journaling enabled. This will greatly expedite recovery once power is restored.
Journaling is covered in chapter 10.
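As an illustration, journaling is enabled with the --journal startup option.
Here's a minimal sketch of starting one shard member and one config server with
journaling on; the replica set name and dbpaths are hypothetical:

    mongod --shardsvr --replSet shard-a --dbpath /data/shard-a --journal
    mongod --configsvr --dbpath /data/config --journal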
Still, some failures are more severe. Power outages can sometimes last for days.
Floods, earthquakes, and other natural disasters can physically destroy a data
center. Users who want to be able to recover quickly from these kinds of failures
must deploy their shard clusters across multiple data centers.
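One common approach, sketched here with hypothetical hostnames, is to place a
priority-0 member of each shard's replica set in a second data center, so that a
copy of the data survives the loss of the primary facility. From the mongo shell:

    rs.initiate({
      _id: "shard-a",
      members: [
        { _id: 0, host: "dc1-node1.example.com:27017" },
        { _id: 1, host: "dc1-node2.example.com:27017" },
        // priority 0 keeps the remote member from ever becoming primary
        { _id: 2, host: "dc2-node1.example.com:27017", priority: 0 }
      ]
    })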
The decision about which sharding topology is best for your application should always
be based on serious consideration of how much downtime you can tolerate, as
measured by your mean time to recovery (MTTR). Think about the potential failure
scenarios and simulate them. Consider the consequences for your application (and
business) if a data center should fail.
CONFIGURATION NOTES
The following are a few useful notes on configuring a sharded cluster.
Estimating cluster size
Users frequently want to know how many shards to deploy and how large each shard
should be. The answer, of course, depends on the circumstances. If you're deploying
on Amazon's EC2, you shouldn't shard until you've maxed out the largest available
instances. At the time of this writing, the largest EC2 nodes have 68 GB of RAM. If
you're running on your own hardware, you can easily go larger. It wouldn't be
unreasonable to wait until you've reached 100 GB of data before sharding.
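As a quick way to see how close you are to such a threshold, the mongo shell's
db.stats() helper reports a database's total data size in bytes. A minimal
sketch:

    var stats = db.stats()
    print("data size in GB: " + stats.dataSize / (1024 * 1024 * 1024))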
Naturally, each additional shard introduces extra complexity, and each shard also
requires replicas. Thus it's better to have a small number of large shards than a large
number of small ones.
Sharding an existing collection
You can shard existing collections, but don't be surprised if it takes some time to
distribute the data across shards. Only one balancing round can happen at a time, and
the migrations will move only around 100-200 MB of data per minute. Thus, sharding
a 50 GB collection will take around eight hours, and this will likely involve some
moderate disk activity. In addition, when you initially shard a large collection like
this, you may have to split manually to expedite the sharding process, since splitting
is triggered by inserts.
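To illustrate, here's a minimal sketch of sharding an existing collection and then
pre-splitting it by hand using the mongo shell's sh helpers; the database name,
collection name, and shard key values are all hypothetical:

    sh.enableSharding("app")
    sh.shardCollection("app.events", { userId: 1 })

    // Pre-split manually so migrations can begin immediately,
    // rather than waiting for inserts to trigger splits.
    sh.splitAt("app.events", { userId: "user2500" })
    sh.splitAt("app.events", { userId: "user5000" })
    sh.splitAt("app.events", { userId: "user7500" })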