The sharding system uses the shard key to map data into chunks, which are logically contiguous ranges of
document keys (see Chapter 5 for more information on chunks). Each chunk identifies a set of documents whose
shard key values fall within a particular continuous range; these ranges enable the mongos controller to quickly find the chunk
that contains a document it needs to work on. MongoDB's sharding system then stores each chunk on an available
shard; the config servers keep track of which chunk is stored on which shard server. This is an important feature
of the implementation because it allows you to add and remove shards from a cluster without having to back up and
restore the data.
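To make the chunk lookup concrete, here is a minimal Python sketch of the idea: a table of contiguous shard-key ranges, each owned by one shard, searched the way mongos searches its cached copy of the config metadata. The chunk table, range boundaries, and shard names are all hypothetical illustrations, not MongoDB's actual internal data structures.

```python
import bisect

# Hypothetical chunk table: each chunk covers a contiguous range of
# shard-key values [lower, upper) and lives on one shard. In a real
# cluster the config servers hold this mapping and mongos caches it.
chunks = [
    {"lower": 0,   "upper": 100, "shard": "shard01"},
    {"lower": 100, "upper": 500, "shard": "shard02"},
    {"lower": 500, "upper": 900, "shard": "shard03"},
]

def find_chunk(shard_key_value):
    """Binary-search the chunk table for the chunk whose range
    contains the given shard-key value."""
    lowers = [c["lower"] for c in chunks]
    i = bisect.bisect_right(lowers, shard_key_value) - 1
    if i < 0 or shard_key_value >= chunks[i]["upper"]:
        raise KeyError("no chunk owns this shard-key value")
    return chunks[i]

print(find_chunk(250)["shard"])  # → shard02
```

Because the ranges are contiguous and sorted, the lookup is a simple binary search; this is why range-based chunking lets the router target a single shard instead of broadcasting every query.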
When you add a new shard to the cluster, the system will migrate a number of chunks across the new set of
servers in order to distribute them evenly. Similarly, when you remove a shard, the sharding controller will drain the
chunks out of the shard being taken offline and redistribute them to the remaining shard servers.
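The draining behavior can be sketched in a few lines of Python. This is an illustrative simulation of the rebalancing idea only, not the balancer's real algorithm: the chunk-to-shard mapping, the round-robin placement, and all the names are assumptions made for the example.

```python
def drain_shard(chunk_owners, departing):
    """Simulate draining a shard: chunk_owners maps chunk id -> shard
    name. Returns a new mapping with the departing shard's chunks
    spread round-robin across the remaining shards."""
    remaining = sorted({s for s in chunk_owners.values() if s != departing})
    if not remaining:
        raise ValueError("cannot drain the last shard in the cluster")
    moved = 0
    result = {}
    for chunk_id, shard in sorted(chunk_owners.items()):
        if shard == departing:
            # Reassign each of the departing shard's chunks in turn.
            result[chunk_id] = remaining[moved % len(remaining)]
            moved += 1
        else:
            result[chunk_id] = shard
    return result

owners = {"c1": "shard01", "c2": "shard02", "c3": "shard01", "c4": "shard03"}
print(drain_shard(owners, "shard01"))
```

After the call, no chunk is owned by the departing shard, and its former chunks are shared evenly among the survivors; the real balancer migrates the underlying documents as well, one chunk at a time.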
A sharding setup for MongoDB also needs a place to store the configuration of its shards, as well as a place to
store information about each shard server in the cluster. To support this, a MongoDB server called a config server is
required; this server instance is a mongod server running in a special role. As explained earlier, the config servers also
act as directories that allow the location of each chunk to be determined. You can have either one (development)
or three (production) config servers in your cluster. Running three config servers in production is strongly
recommended, because losing your config server metadata means you can no longer determine which parts of your sharded data
are on which shards!
At first glance, implementing a solution that relies on sharding appears to require a lot of servers! However,
you can co-host multiple instances of the different services required for a sharding setup on a relatively
small number of physical servers (similar to what you saw in Chapter 11's coverage of replication), but you will need to
implement strict resource management to avoid having MongoDB processes compete with each other for resources such as
RAM. Figure 12-3 shows a fully redundant sharding system that uses replica sets for the shard stores and the config
servers, as well as a set of mongos processes to manage the cluster. It also shows how those services can be condensed to run on
just three physical servers.
[Figure: three physical servers, each hosting mongod shard stores (members of replica sets Shard 01, Shard 02, and Shard 03), a mongod config server, and a mongos shard controller; clients connect to the cluster through the mongos processes.]
Figure 12-3. A redundant sharding configuration