"t" : 1000,
"i" : 0
},
"ns" : "cloud-docs.spreadsheets",
"min" : {
"username" : { $minKey : 1 },
"_id" : { $minKey : 1 }
},
"max" : {
"username" : { $maxKey : 1 },
"_id" : { $maxKey : 1 }
},
"shard" : "shard-a"
}
Can you figure out what range this chunk represents? If there's just one chunk, then it spans the entire sharded collection. That's borne out by the min and max fields, which show that the chunk's range is bounded by $minKey and $maxKey.
MINKEY AND MAXKEY
$minKey and $maxKey are used in comparison operations as the boundaries of BSON types. $minKey always compares lower than all BSON types, and $maxKey compares greater than all BSON types. Because the value for any given field can be of any BSON type, MongoDB uses these two types to mark the chunk endpoints at the extremities of the sharded collection.
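You can inspect chunk ranges yourself by querying the config database, where the cluster's metadata lives. Run through the mongos, the query below projects just the range fields from each chunk document:

> use config
switched to db config
> db.chunks.find({ "ns" : "cloud-docs.spreadsheets" },
    { "min" : 1, "max" : 1, "shard" : 1 })

With only one chunk, this returns a single document whose min and max match the $minKey and $maxKey bounds shown above.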
You can see a more interesting chunk range by adding more data to the spreadsheets collection. You'll use the Ruby script again, but this time you'll run 100 iterations, which will insert an extra 20,000 documents totaling 100 MB:
$ ruby load.rb 100
Verify that the insert worked:
> db.spreadsheets.count()
20200
> db.spreadsheets.stats().size
103171828
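With roughly 100 MB now in the collection, you've far exceeded the maximum chunk size (64 MB has long been the default), so once the splits have caught up you'll have more than one chunk. You can confirm this from the config database; the exact count will vary with your chunk size settings and how far the splitting has progressed:

> use config
switched to db config
> db.chunks.count({ "ns" : "cloud-docs.spreadsheets" })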
Sample insert speed
Note that it may take several minutes to insert this data into the shard cluster. There are three reasons for the slowness. First, you're performing a round trip for each insert, whereas you might be able to perform bulk inserts in a production situation. Second, you're inserting using Ruby, whose BSON serializer is slower than that of certain other drivers. Finally, and most significantly, you're running all of the shard's nodes on a single machine. This places a huge burden on the disk, as four of your nodes are being written to simultaneously (two replica set primaries, and two replicating secondaries). Suffice it to say that in a proper production installation, this insert would run much more quickly.
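On the first point, a bulk insert batches many documents into one network round trip instead of paying that cost per document. Here's a minimal sketch you can paste into the mongo shell (assuming shell version 2.6 or later, where insert() accepts an array; the document shape here is hypothetical and far smaller than what load.rb generates):

var docs = [];
for (var n = 0; n < 1000; n++) {
    // Each document includes the username shard key so mongos can route it
    docs.push({ "username" : "user" + n, "updated_at" : new Date() });
}
db.spreadsheets.insert(docs);    // 1,000 documents in a single round trip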
 