In addition to aggregating server statistics, you'll want to keep an eye on the
distribution of chunks and on individual chunk sizes. As you saw in the sample cluster,
all of this information is stored in the config database. If you ever detect unbalanced
chunks or unchecked chunk growth, you can use the split and movechunk commands to
address these issues. Alternatively, you can consult the logs to see whether the
balancing operation has halted for some reason.
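As a rough illustration of checking chunk distribution, the following sketch tallies chunks per shard from documents shaped like those in the config database's chunks collection. It relies only on each document's shard field; the function names and the sample data are hypothetical. In a shell connected to mongos, you could pass it db.getSiblingDB("config").chunks.find().toArray() instead of the sample array.

```javascript
// Count chunks per shard. Each input document is expected to carry a
// `shard` field, as chunk documents in the config database do.
function chunksPerShard(chunks) {
  const counts = {};
  for (const chunk of chunks) {
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  }
  return counts;
}

// Difference between the most and least loaded shards; a large value
// suggests the cluster is unbalanced.
function chunkSpread(counts) {
  const values = Object.values(counts);
  if (values.length === 0) return 0;
  return Math.max(...values) - Math.min(...values);
}

// Hypothetical sample data standing in for real chunk documents.
const sampleChunks = [
  { shard: "shard-a" },
  { shard: "shard-a" },
  { shard: "shard-a" },
  { shard: "shard-b" },
];

const counts = chunksPerShard(sampleChunks);
console.log(counts, "spread:", chunkSpread(counts));
```

A spread that keeps growing over time is a hint that balancing has stalled and that the logs deserve a look.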
MANUAL PARTITIONING
There are a couple of cases where you may want to manually split and migrate chunks
on a live sharded cluster. For example, as of MongoDB v2.0, the balancer doesn't directly
take into account the load on any one shard. Obviously, the more a shard is written to,
the larger its chunks become, and the more likely they are to eventually migrate.
Nevertheless, it's not hard to imagine situations where you'd be able to alleviate load
on a shard by migrating chunks. This is another situation where the movechunk command
can be helpful.
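To make the shape of these commands concrete, here's a minimal sketch that builds the command documents for a manual split and a manual migration. The helper names, the namespace, the shard key, and the target shard are all hypothetical. The sketch uses the camel-case spelling moveChunk found in current MongoDB documentation; the lowercase movechunk used in the text is assumed to be the older equivalent.

```javascript
// Build the document that splits a chunk at an explicit shard-key value.
function splitCommand(namespace, middleKey) {
  return { split: namespace, middle: middleKey };
}

// Build the document that moves the chunk containing `findKey` to `toShard`.
function moveChunkCommand(namespace, findKey, toShard) {
  return { moveChunk: namespace, find: findKey, to: toShard };
}

// Hypothetical namespace, shard key, and shard name, for illustration only.
const ns = "cloud-docs.spreadsheets";
const splitCmd = splitCommand(ns, { username: "m" });
const moveCmd = moveChunkCommand(ns, { username: "m" }, "shard-b");

console.log(JSON.stringify(splitCmd));
console.log(JSON.stringify(moveCmd));

// In a shell connected to mongos, each document would be passed to
// db.getSiblingDB("admin").runCommand(...).
```

The command name has to be the first key in each document, which is why the helpers list it first.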
ADDING A SHARD
If you've determined that you need more capacity, you can add a new shard to an
existing cluster using the same method you used earlier:
sh.addShard("shard-c/rs1.example.net:27017,rs2.example.net:27017")
When adding capacity in this way, be realistic about how long it'll take to migrate data
to the new shard. As stated earlier, you can expect data to migrate at a rate of 100-200 MB
per minute. This means that if you need to add capacity to a sharded cluster, you
should do so long before performance starts to degrade. To determine when you
need to add a new shard, consider the rate at which your data set is growing. Obviously,
you'll want to keep indexes and working set in RAM. So a good rule of thumb is
to plan to add a new shard at least several weeks before the indexes and working set
on your existing shards reach 90% of RAM.
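The arithmetic behind this rule of thumb can be sketched as follows. The function names and all of the input figures are hypothetical, and the projection assumes linear growth, which real data sets may not follow; only the 100-200 MB per minute range and the 90% threshold come from the text.

```javascript
// Minutes needed to migrate `megabytes` at `mbPerMinute`.
function migrationMinutes(megabytes, mbPerMinute) {
  return megabytes / mbPerMinute;
}

// Weeks until indexes plus working set reach `threshold` (a fraction of RAM),
// assuming the footprint grows linearly by `growthGbPerWeek`.
function weeksUntilThreshold(ramGb, usedGb, growthGbPerWeek, threshold = 0.9) {
  const remainingGb = ramGb * threshold - usedGb;
  if (remainingGb <= 0) return 0; // already at or past the threshold
  return remainingGb / growthGbPerWeek;
}

// Hypothetical figures: 60,000 MB to move onto the new shard.
const slowestHours = migrationMinutes(60000, 100) / 60;
const fastestHours = migrationMinutes(60000, 200) / 60;
console.log("migration takes between", fastestHours, "and", slowestHours, "hours");

// Hypothetical shard: 64 GB of RAM, 48 GB in use, growing 1.2 GB per week.
const weeksLeft = weeksUntilThreshold(64, 48, 1.2);
console.log("weeks until 90% of RAM:", weeksLeft.toFixed(1));
```

With these made-up numbers, the shard has roughly eight weeks of headroom, so the new shard should be planned well inside that window, leaving time for the migration itself.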
If you're not willing to play it safe, as described here, then you open yourself up to
a world of pain. Once your indexes and working set don't fit in RAM, your application
can come to a halt, especially if the application demands high write and read throughput.
The problem is that the database will have to page to and from the disk, which
will slow reads and writes, backlogging operations that can't be served into a read/write
queue. At that point, adding capacity is difficult because migrating chunks
between shards adds read load to existing shards. Obviously, when a database is
overloaded, the last thing you want to do is add load.
All of this is just to emphasize that you should monitor your cluster and add capacity
well before you need to.
REMOVING A SHARD
You may, in rare cases, want to remove a shard. You can do so using the removeshard
command:
> use admin
> db.runCommand({removeshard: "shard-1/arete:30100,arete:30101"})