Sharding Concerns
If you need to shard the data for this system, the _id field is a reasonable choice for shard key
since most updates use the _id field in their spec, allowing mongos to route each update to a
single mongod process. There are a couple of potential drawbacks with using _id , however:
▪ If the cart collection's _id is an increasing value such as an ObjectId() , all new carts end
up on a single shard.
▪ Cart expiration and inventory adjustment require update operations and queries to
broadcast to all shards when using _id as a shard key.
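The first drawback can be illustrated with a toy model of range-based routing. This is only a sketch, not how mongos is implemented: the split points, shard names, and the route helper are all hypothetical, and plain integers stand in for timestamp-prefixed ObjectId values.

```python
import bisect

# Hypothetical split points for a range-sharded key space (assumption:
# three shards, boundaries picked arbitrarily for illustration).
SPLIT_POINTS = [1000, 2000]
SHARDS = ["shard0", "shard1", "shard2"]

def route(key):
    """Return the shard whose range [lower, upper) contains key."""
    return SHARDS[bisect.bisect_right(SPLIT_POINTS, key)]

# Monotonically increasing keys, standing in for newly created ObjectIds.
new_keys = range(5000, 5010)
targets = {route(k) for k in new_keys}
print(targets)  # every new key falls in the last range
```

Because each new key is larger than every existing split point, all inserts in this model land in the final range, which is the hot-shard behavior described above.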
It's possible to mitigate the first problem at least by using a pseudorandom value for _id when
creating a cart. A reasonable approach would be the following:
import hashlib
import bson

def new_cart():
    # ObjectId gives us a unique value to feed into the hash
    object_id = bson.ObjectId()
    # md5 needs bytes, so encode the ObjectId's string form first
    cart_id = hashlib.md5(str(object_id).encode('utf-8')).hexdigest()
    return cart_id
We're creating a bson.ObjectId() to get a unique value to use in our hash. Note
that since ObjectId uses the current timestamp as its most significant bits, it's not
an appropriate choice for a shard key on its own.
We then hash the object_id with MD5, which scrambles its ordering and yields a hex
string that is extremely likely to be unique in our system.
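To see why hashing helps, the following sketch compares the leading character of consecutive raw IDs with that of their MD5 digests. It uses only the standard library: the raw_ids values are made-up 24-character hex strings standing in for consecutive ObjectIds, and hashed_id is a hypothetical helper mirroring the hashing step in new_cart.

```python
import hashlib
from collections import Counter

def hashed_id(raw_id):
    """Hash a string ID with MD5 and return the 32-char hex digest."""
    return hashlib.md5(raw_id.encode("utf-8")).hexdigest()

# Stand-ins for consecutive ObjectId strings: 24 hex chars, increasing by 1.
BASE = 0x5f0000000000000000000000
raw_ids = [format(BASE + i, "024x") for i in range(1000)]
cart_ids = [hashed_id(r) for r in raw_ids]

# Count how many distinct leading characters each set of IDs has.
raw_prefixes = Counter(r[0] for r in raw_ids)
hashed_prefixes = Counter(c[0] for c in cart_ids)
print(len(raw_prefixes), len(hashed_prefixes))
```

The raw IDs all share the same leading character, so they sort next to one another, while the digests are scattered across the key space and would be spread over many chunk ranges.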
To actually perform the sharding, we'd execute the following commands (db here must
refer to the admin database, since shardcollection is an administrative command):
>>> db.command('shardcollection', 'dbname.inventory',
...     key={'_id': 1})
{ "collectionsharded" : "dbname.inventory", "ok" : 1 }
>>> db.command('shardcollection', 'dbname.cart',
...     key={'_id': 1})
{ "collectionsharded" : "dbname.cart", "ok" : 1 }