of shard key may depend on the precise workload of your deployment, but the choice of site-page
is likely to perform well and lead to a well-balanced cluster for most deployments.
To enable sharding for the daily and monthly statistics collections, we can execute the
following commands in the Python console:
>>> db.command('shardcollection', 'dbname.stats.daily',
...     key={'metadata.site': 1, 'metadata.page': 1})
{ "collectionsharded" : "dbname.stats.daily", "ok" : 1 }
>>> db.command('shardcollection', 'dbname.stats.monthly',
...     key={'metadata.site': 1, 'metadata.page': 1})
{ "collectionsharded" : "dbname.stats.monthly", "ok" : 1 }
One downside of the { metadata.site: 1, metadata.page: 1 } shard key is that if one
page dominates all your traffic, all updates to that page will go to a single shard. This is
basically unavoidable, since all updates for a single page are going to a single document.
You may wish to include the date in addition to the site and page fields so that MongoDB
can split histories and serve different historical ranges with different shards. Note that this still
does not solve the problem; all updates to a page will still go to one chunk, but historical
queries will scale better.
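To make the difference concrete, here is a minimal sketch, not MongoDB's actual routing code, that models chunks as the ranges between sorted split points. The site name, page paths, dates, and split points are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from datetime import datetime

def chunk_for(key, split_points):
    """Return the index of the chunk whose range contains `key`.

    `split_points` is a sorted list of shard-key tuples. Chunk i covers
    keys from split_points[i-1] (inclusive) up to split_points[i] (exclusive).
    """
    return bisect_right(split_points, key)

# Hypothetical busy page with three months of history.
page = ('site-1', '/index.html')
dates = [datetime(2010, 10, 1), datetime(2010, 11, 1), datetime(2010, 12, 1)]

# Two-part key: every document for the page has the same key value,
# so no split point can ever separate one month from another.
two_part_splits = [('site-1', '/about.html'), ('site-1', '/products.html')]
two_part_chunks = {chunk_for(page, two_part_splits) for _ in dates}

# Three-part key: split points may fall between dates of the same page,
# so each month's document can live in a different chunk.
three_part_splits = [page + (datetime(2010, 11, 1),),
                     page + (datetime(2010, 12, 1),)]
three_part_chunks = [chunk_for(page + (d,), three_part_splits) for d in dates]

print(two_part_chunks)
print(three_part_chunks)
```

In this toy model the two-part key places all three months in one chunk, while the three-part key spreads them over three chunks. Updates for any single month still land in exactly one chunk, which matches the caveat above.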
To enable the three-part shard key, we simply pass the new key to the shardcollection command:
>>> db.command('shardcollection', 'dbname.stats.daily',
...     key={'metadata.site': 1, 'metadata.page': 1, 'metadata.date': 1})
{ "collectionsharded" : "dbname.stats.daily", "ok" : 1 }
>>> db.command('shardcollection', 'dbname.stats.monthly',
...     key={'metadata.site': 1, 'metadata.page': 1, 'metadata.date': 1})
{ "collectionsharded" : "dbname.stats.monthly", "ok" : 1 }
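Since both collections take the same key, the two calls can be generated from one definition. The following is only a sketch: the helper name shard_commands is hypothetical, and the commented-out call assumes a PyMongo database handle named db that accepts the key as a keyword argument.

```python
def shard_commands(db_name, collections, key_fields):
    """Build one (command, namespace, key) triple per collection.

    The key dict lists the fields in the order given, each ascending (1).
    Field order matters for a compound shard key.
    """
    commands = []
    for coll in collections:
        namespace = '%s.%s' % (db_name, coll)
        key = {field: 1 for field in key_fields}
        commands.append(('shardcollection', namespace, key))
    return commands

commands = shard_commands(
    'dbname',
    ['stats.daily', 'stats.monthly'],
    ['metadata.site', 'metadata.page', 'metadata.date'])

for name, namespace, key in commands:
    print(name, namespace, key)
    # With a live connection (assumed, not shown here), each triple
    # could be sent as: db.command(name, namespace, key=key)
```

Keeping the field list in one place avoids the daily and monthly collections drifting apart if the shard key is revised later.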
Hierarchical Aggregation
Although the techniques of Pre-Aggregated Reports can satisfy many operational intelligence
system needs, it's often desirable to calculate statistics at multiple levels of abstraction. This
section describes how we can use MongoDB's mapreduce command to convert transactional
data to statistics at multiple layers of abstraction.