your driver in case you ever need to examine what's being sent to the database. For
instance, when demonstrating capped collections, it was reasonable to assume that the
sample document size was roughly 100 bytes. You can check this assumption using the
Ruby driver's BSON serializer:
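require 'bson'  # legacy 1.x bson gem, which defines BSON::BSON_CODER
doc = {
  :_id => BSON::ObjectId.new,
  :username => "kbanker",
  :action_code => rand(5),
  :time => Time.now.utc,
  :n => 1
}
# Serialize the hash and report the size of the resulting BSON
bson = BSON::BSON_CODER.serialize(doc)
puts "Document #{doc.inspect} takes up #{bson.length} bytes as BSON"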
The serialize method returns a byte array. If you run the preceding code, you'll get
a BSON object 82 bytes long, which isn't far from the estimate. If you ever want to
check the BSON size of an object using the shell, that's also straightforward:
> doc = {
    _id: new ObjectId(),
    username: "kbanker",
    action_code: Math.ceil(Math.random() * 5),
    time: new Date(),
    n: 1
  }
> Object.bsonsize(doc);
82
Again, you get 82 bytes. The difference between the 82-byte document size and the
100-byte estimate is due to normal collection and document overhead.
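To see where the 82-byte figure comes from, the size can also be tallied by hand from the BSON layout. The following is a minimal sketch in plain Ruby, with no driver involved. The helper names element_size and string_value_size are hypothetical, and the sketch assumes both integers are encoded as 4-byte int32 values.

```ruby
# Hand-computed BSON size for the sample document.
# Assumption: action_code and n are encoded as 4-byte int32 values.

# Bytes used by one element: 1 type byte + key + key's null terminator + value.
def element_size(key, value_bytes)
  1 + key.bytesize + 1 + value_bytes
end

# A BSON string value: 4-byte length prefix + the bytes + a trailing null byte.
def string_value_size(str)
  4 + str.bytesize + 1
end

object_id_bytes = 12  # an ObjectId is always 12 bytes
int32_bytes     = 4
datetime_bytes  = 8   # UTC datetime, stored as a 64-bit integer

elements = {
  "_id"         => object_id_bytes,
  "username"    => string_value_size("kbanker"),
  "action_code" => int32_bytes,
  "time"        => datetime_bytes,
  "n"           => int32_bytes
}

# Document framing: 4-byte length prefix + all elements + 1 terminating null byte.
total = 4 + elements.sum { |key, bytes| element_size(key, bytes) } + 1

elements.each do |key, bytes|
  puts "#{key.ljust(12)} #{element_size(key, bytes)} bytes"
end
puts "total        #{total} bytes"  # prints a total of 82
```

Under these assumptions the elements come to 17, 22, 17, 14, and 7 bytes, and the 5 bytes of framing bring the total to 82. A number stored as a double would take 8 bytes instead of 4, so the total depends on how each value is typed.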
Deserializing BSON is just as straightforward. Try running this code to verify that it
works:
deserialized_doc = BSON::BSON_CODER.deserialize(bson)
puts "Here's our document deserialized from BSON:"
puts deserialized_doc.inspect
Do note that you can't serialize just any Ruby hash. To serialize without error, the key
names must be valid, and each of the values must be convertible into a BSON type. A
valid key name consists of a null-terminated string with a maximum length of 255
bytes. The string may consist of any combination of ASCII characters, with three
exceptions: it can't begin with a $, it must not contain any . characters, and it must
not contain the null byte except in the final position. When programming in Ruby,
you may use symbols as hash keys, but they'll be converted into their string equivalents
when serialized.
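The key-name rules just described can be sketched as a small check. This is a hypothetical helper, not part of the driver. It assumes the 255-byte limit applies to the key's bytes without the null terminator, which the serializer adds itself, so the key passed in should contain no null byte at all.

```ruby
# Hypothetical helper: checks a hash key against the key-name rules in the text.
# Assumption: max_bytes counts the key's bytes, excluding the null terminator.
def valid_bson_key?(key, max_bytes = 255)
  name = key.to_s                           # symbols become strings, as when serialized
  return false if name.bytesize > max_bytes # too long
  return false if name.start_with?("$")     # can't begin with $
  return false if name.include?(".")        # can't contain .
  return false if name.include?("\0")       # null byte is reserved for the terminator
  true
end

puts valid_bson_key?(:username)       # prints true
puts valid_bson_key?("$set")          # prints false
puts valid_bson_key?("address.city")  # prints false
```

Running a check like this over a hash's keys before serializing would flag a bad key name early, rather than leaving it to the serializer to raise an error.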
It's important to consider the length of the key names you choose, since key names
are stored in the documents themselves. This contrasts with an RDBMS, where column
names are always kept separate from the rows they refer to. So when using BSON, if you