Option 4: Shard by combining a natural and synthetic key
MongoDB supports compound shard keys that combine the best aspects of options 2 and 3. In
these situations, the shard key would resemble { path: 1 , ssk: 1 } , where path is an
often-used natural key or value from your data and ssk is a hash of the _id field.
Using this type of shard key, data is largely distributed by the natural key, or path , which
makes most queries that access the path field local to a single shard or group of shards. At the
same time, if there is not sufficient distribution for specific values of path , the ssk makes it
possible for MongoDB to create chunks that distribute data across the cluster.
In most situations, these kinds of keys provide the ideal balance between distributing writes
across the cluster and ensuring that most queries will only need to access a select number of
shards.
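As a rough illustration of the compound key described above, the following Python sketch adds an ssk field to each document before insertion and builds the admin command document that would shard on { path: 1, ssk: 1 }. The events.events namespace, the with_shard_key helper, and the choice of MD5 over the string form of _id are assumptions made for this example, not details from the text; any stable hash of _id would serve the same purpose.

```python
import hashlib


def with_shard_key(doc):
    """Return a copy of doc with an `ssk` field derived from its _id.

    Hypothetical helper: MD5 over str(_id) is an illustrative choice only.
    """
    out = dict(doc)
    out["ssk"] = hashlib.md5(str(doc["_id"]).encode("utf-8")).hexdigest()
    return out


# Admin command document for sharding an assumed events.events collection
# on the compound key. With a driver such as PyMongo this could be passed
# to client.admin.command(...).
SHARD_COMMAND = {
    "shardCollection": "events.events",
    "key": {"path": 1, "ssk": 1},
}

event = with_shard_key({"_id": "5f1e9c", "path": "/apache_pb.gif"})
```

Because the ssk value depends only on _id, it can be recomputed at any time, and queries that filter on path alone can still be routed using the leading part of the shard key.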
Test with your own data
Selecting shard keys is difficult because there are no definitive “best practices,” the decision
has a large impact on performance, and it is difficult or impossible to change the shard key
after making the selection.
This section provides a good starting point for thinking about shard key selection. Nevertheless,
the best way to select a shard key is to analyze the actual insertions and queries from
your own application.
Although the details are beyond our scope here, you may also consider pre-splitting your
chunks if your application has a high-volume, predictable insert pattern. In this case, you
create empty chunks and manually pre-distribute them among your shard servers. Again, the best
solution is to test with your own data.
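A minimal sketch of what pre-splitting could look like for a { path: 1, ssk: 1 } shard key follows. It only builds the split and moveChunk admin command documents and does not contact a cluster. The presplit_commands helper, the namespace, the shard names, and the assumption that ssk is a lowercase hex string (so that one-character hex prefixes are valid boundaries) are all hypothetical choices for this example.

```python
def presplit_commands(namespace, paths, shards, buckets_per_path=4):
    """Build split/moveChunk command documents for known path values.

    Each path is divided into `buckets_per_path` ranges of the ssk space,
    and the resulting chunks are assigned to shards round-robin.
    """
    commands = []
    chunk_index = 0
    for path in paths:
        for bucket in range(buckets_per_path):
            # Lower bound of this bucket as a one-character hex prefix,
            # e.g. "0", "4", "8", "c" when there are 4 buckets.
            boundary = format(bucket * 16 // buckets_per_path, "x")
            middle = {"path": path, "ssk": boundary}
            target = shards[chunk_index % len(shards)]
            # Split at the boundary, then move the chunk that starts there.
            commands.append({"split": namespace, "middle": middle})
            commands.append(
                {"moveChunk": namespace, "find": middle, "to": target}
            )
            chunk_index += 1
    return commands


commands = presplit_commands(
    "events.events",
    ["/index.html", "/apache_pb.gif"],
    ["shard0000", "shard0001"],
)
```

Each generated document could be sent to the cluster as an admin command while the collection is still empty. Whether the resulting layout actually helps depends on the real insert pattern, so it should be checked against your own data.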
Managing Event Data Growth
Without some strategy for managing the size of your database, an event logging system will
grow indefinitely. This is particularly important in the context of MongoDB because MongoDB,
as of this writing, does not return disk space to the filesystem, even when data is
removed from the database (i.e., the data files for your database will never shrink on disk).
This section describes a few strategies to consider when managing event data growth.