Database Reference
In-Depth Information
Finally, you want to make sure that you can add or remove servers from the set of shards without having to back
up and restore the data and
redistribute
it across a smaller or larger set of shards. Further, you need to be able to do
this without causing any down time on the cluster. Let's call this
Requirement 3: The ability to add or remove shards
while the system is running
.
The upcoming sections will cover how to address these requirements.
Implementing Sharding with MongoDB
MongoDB uses a
proxy
mechanism to support sharding (see Figure
12-2
); the provided
mongos
daemon acts as a
controller
for multiple
mongod
-based shard servers. Your application attaches to the
mongos
process as though it were
a single MongoDB database server; thereafter, your application sends all of its commands (such as updates, queries,
and deletes) to that
mongos
process.
mongod - Config Server
mongod - Shard 01
Client
mongos - Shard Controller
mongod - Shard 02
Figure 12-2.
A simple sharding setup without redundancy
The
mongos
process is responsible for managing which MongoDB server is sent the commands from your
application, and this daemon will reissue queries that cross multiple shards to multiple servers and aggregate the
results together.
MongoDB implements sharding at the collection level, not the database level. In many systems, only one or two
collections may grow to the point where sharding is required. Thus, sharding should be used judiciously; you don't
want to impose the overhead of managing the distribution of data for smaller collections if you don't need to.
Let's return to the fictitious Gaelic social network example. In this application, the
user
collection contains
details about its users and their profiles. This collection is likely to grow to the point where it needs to be sharded.
However, other collections, such as
events
,
countries
, and
states
, are unlikely to ever become so large that sharding
would provide any benefit.