{ "_id" : "balancer", "process" : "arete:40000:1299516887:1804289383",
"state" : 1,
"ts" : ObjectId("4d890d30bd9f205b29eda79e"),
"when" : ISODate("2011-03-22T20:57:20.249Z"),
"who" : "arete:40000:1299516887:1804289383:Balancer:846930886",
"why" : "doing balance round"
}
Any state greater than 0 indicates that balancing is happening. The process field
shows the host name and port of the computer running the mongos that's orchestrating
the balancing round. In this case, the host is arete:40000. If balancing ever fails
to stop after you modify the settings collection, you should examine the logs from the
balancing mongos for errors.
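The check described above can be sketched as a pair of small helpers. This is a minimal sketch in plain JavaScript, assuming the lock document has the shape shown earlier. The names isBalancing and balancerHost are hypothetical, and the host parsing assumes that the process field begins with the host name and port.

```javascript
// Hypothetical helpers; they are not part of the MongoDB shell.

// Returns true when the balancer lock document shows a balancing round in progress.
function isBalancing(lockDoc) {
  return lockDoc !== null && lockDoc !== undefined && lockDoc.state > 0;
}

// Extracts "host:port" from a process string such as
// "arete:40000:1299516887:1804289383" (assumes the first two
// colon-separated parts are the host name and port).
function balancerHost(lockDoc) {
  var parts = String(lockDoc.process).split(":");
  return parts[0] + ":" + parts[1];
}

// The document shown above, trimmed to the fields the helpers read.
var lock = {
  _id: "balancer",
  process: "arete:40000:1299516887:1804289383",
  state: 1
};
```

In the shell, the lock document itself could be fetched with db.getSiblingDB("config").locks.findOne({_id: "balancer"}) and passed to either helper.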
Once you know that the balancer has stopped, it's safe to run your backups. After
taking your backups, don't forget to restart the balancer. You can do so by resetting
the stopped value:
> use config
> db.settings.update({_id: "balancer"}, {$set: {stopped: false}}, true);
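Because forgetting to restart the balancer is an easy mistake, one option is to wrap the backup so that the restart always happens. The following is only a sketch: backupThenRestartBalancer, setStopped, and runBackup are hypothetical names, and the two callbacks are supplied by the caller.

```javascript
// Hypothetical helper: runs a backup and always re-enables the balancer afterwards.
// setStopped: function(flag) that sets the balancer's stopped value
// runBackup:  function that performs the backup and returns its result
function backupThenRestartBalancer(setStopped, runBackup) {
  try {
    return runBackup();
  } finally {
    // Restart the balancer even if the backup threw an error.
    setStopped(false);
  }
}
```

In the shell, setStopped could be a function that issues the same update shown above with the given flag, for example function (flag) { db.getSiblingDB("config").settings.update({_id: "balancer"}, {$set: {stopped: flag}}, true); }.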
To simplify some of these operations with the balancer, MongoDB v2.0 has introduced
a couple shell helpers. For example, you can start and stop the balancer with
sh.setBalancerState():
> sh.setBalancerState(false)
This is equivalent to adjusting the stopped value in the settings collection. Once
you've disabled the balancer in this way, you can make repeated calls to
sh.isBalancerRunning() until the balancer stops.
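The repeated calls can be wrapped in a simple polling loop. This is a minimal sketch, not a built-in helper: waitForBalancerToStop is a hypothetical name, and the check and the pause are passed in as functions so that the loop itself makes no assumptions about the shell.

```javascript
// Hypothetical helper: polls until the balancer reports that it has stopped.
// isRunning: function returning true while a balancing round is in progress
// sleepFn:   function that pauses for the given number of milliseconds
// maxTries:  maximum number of checks before giving up
// delayMs:   pause between checks, in milliseconds
// Returns true if the balancer stopped within maxTries checks, false otherwise.
function waitForBalancerToStop(isRunning, sleepFn, maxTries, delayMs) {
  for (var i = 0; i < maxTries; i++) {
    if (!isRunning()) {
      return true;
    }
    sleepFn(delayMs);
  }
  return false;
}
```

In the shell, a call might look like waitForBalancerToStop(function () { return sh.isBalancerRunning(); }, sleep, 30, 1000), where sleep is the shell's built-in pause function. The retry count and delay here are arbitrary illustrative values.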
FAILOVER AND RECOVERY
Although we've covered general replica set failures, it's also important to note a
sharded cluster's potential points of failure along with best practices for recovery.
Failure of a shard member
Each shard consists of a replica set. Thus if the primary member of one of these
replica sets fails, a secondary member will be elected primary, and the mongos process
will automatically connect to it. Chapter 8 describes the specific steps to take in
restoring a failed replica set member. The method you choose depends on how the member
has failed, but regardless, the instructions are the same whether the replica set is
part of a sharded cluster or not.
If you see anomalous behavior after a replica set failover, you can reset the system
by restarting all mongos processes. This will ensure proper connections to the new
replica sets. In addition, if you notice that balancing isn't working, you should check
the config database's locks collection for entries whose process fields point to former
primary nodes. If you see such an entry, the lock document is stale, and you can safely
delete it manually.
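The search for stale locks can be sketched as a filter over the lock documents. This is an illustrative sketch only: findStaleLocks is a hypothetical name, and it assumes that each process field begins with host:port, as in the balancer lock shown earlier.

```javascript
// Hypothetical helper: returns the lock documents whose process field
// starts with the host:port of a former primary.
// lockDocs:        array of documents from the config database's locks collection
// formerPrimaries: array of "host:port" strings for nodes that used to be primary
function findStaleLocks(lockDocs, formerPrimaries) {
  return lockDocs.filter(function (doc) {
    var process = String(doc.process);
    return formerPrimaries.some(function (hostPort) {
      // Match the whole host:port prefix so that "arete:30000"
      // does not also match "arete:300001".
      return process === hostPort || process.indexOf(hostPort + ":") === 0;
    });
  });
}
```

In the shell, the input could come from db.getSiblingDB("config").locks.find().toArray(). Review the returned documents before deleting any of them by hand.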