chunks:
    shard0000  2
    shard0001  3
    { "testkey" : { "$minKey" : 1 } } -->> { "testkey" : 0 } on : shard0000 Timestamp(4, 0)
    { "testkey" : 0 } -->> { "testkey" : 14860 } on : shard0000 Timestamp(3, 1)
    { "testkey" : 14860 } -->> { "testkey" : 45477 } on : shard0001 Timestamp(4, 1)
    { "testkey" : 45477 } -->> { "testkey" : 76041 } on : shard0001 Timestamp(3, 4)
    { "testkey" : 76041 } -->> { "testkey" : { "$maxKey" : 1 } } on : shard0001 Timestamp(3, 5)
This output lists the shard servers, the configuration of each sharded database and collection, and each chunk in
the sharded dataset. Because you used a small chunkSize value to simulate a larger sharding setup, the report lists
many chunks. One important piece of information you can read from this listing is the range of shard key values
associated with each chunk; the output also shows which shard server each chunk is stored on. You can use
the output of this command as the basis for a tool that analyzes how keys and chunks are distributed across the
shard servers, for example to determine whether there is any clumping of data in the dataset.
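You do not have to parse the sh.status() text to do this; the chunk metadata is also stored in the config database that
mongos maintains. The following is a minimal sketch, run from the mongo shell against a mongos instance, that groups
the chunk documents for the example collection by shard (the namespace testdb.testcollection comes from the examples
in this chapter):

use config
// count how many chunks each shard currently holds for the example collection
db.chunks.aggregate([
    { $match : { ns : "testdb.testcollection" } },
    { $group : { _id : "$shard", chunkCount : { $sum : 1 } } },
    { $sort : { chunkCount : -1 } }
])

The counts returned should match the per-shard totals reported by sh.status(), in this case two chunks on shard0000
and three on shard0001.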
Using Replica Sets to Implement Shards
The examples you have seen so far rely on a single mongod instance to implement each shard. In Chapter 11, you learned how
to create replica sets, which are clusters of mongod instances working together to provide redundant and fail-safe storage.
When adding shards to the sharded cluster, you can provide the name of a replica set and the address of one of its
members, and that shard will be backed by all the members of the replica set. The mongos process tracks which member
is the primary for the replica set and makes sure that all writes for that shard are directed to that instance.
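For example, assuming a replica set named shardRS0 with a member reachable at localhost:27018 (both the set name
and the host are placeholders for this sketch), you would add it as a shard from the mongos shell as follows; you only
need to list one reachable member, and mongos discovers the rest of the set automatically:

sh.addShard("shardRS0/localhost:27018")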
Combining sharding and replica sets enables you to create high-performance, highly reliable clusters that
can tolerate multi-machine failure. It also enables you to maximize the performance and availability of cheap,
commodity-class hardware.
Note  The ability to use replica sets as a storage mechanism for shards satisfies "requirement 2: the ability to store
shard data in a fault-tolerant fashion."
The Balancer
We've previously discussed how MongoDB can automatically keep your workload distributed among all the shards
in your cluster. While you may think that this is done via some form of patented MongoDB-Magic, that's not the case.
Your mongos process has an element within it called the balancer, which moves the logical chunks of data around
within your cluster to ensure that they remain evenly distributed among all your shards. The balancer speaks to the
shards and tells them to migrate data from one shard to another. You can see the distribution of chunks in the
sh.status() output in the following example; here my data is partitioned with two chunks on shard0000 and three
on shard0001.
{ "_id" : "testdb", "partitioned" : true, "primary" : "shard0000" }
testdb.testcollection
shard key: { "testkey" : 1 }
chunks:
shard0000 2
shard0001 3
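If you want to check on the balancer yourself, the mongo shell provides a few helpers you can run against a mongos
instance. A small sketch follows; the exact return values vary slightly between MongoDB versions:

sh.getBalancerState()     // true if the balancer is enabled for the cluster
sh.isBalancerRunning()    // reports whether a balancing round is in progress right now
sh.stopBalancer()         // temporarily disable balancing, for example during maintenance
sh.startBalancer()        // re-enable balancing when you are done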
 
 