"totalSize" => 218103808,
"ok" => true
}
Once you get used to representing documents as Ruby hashes, the transition from the
shell API is almost seamless. It's okay if you're still feeling shaky about using MongoDB
with Ruby; you'll get more practice in section 3.3. But for now we're going to take a
brief intermission to see how the MongoDB drivers work. This will shed more light on
some of MongoDB's design and better prepare you to use the drivers effectively.
3.2 How the drivers work
At this point it's natural to wonder what's going on behind the scenes when you issue
commands through a driver or via the MongoDB shell. In this section, we'll peel away
the curtain to see how the drivers serialize data and communicate it to the database.
All MongoDB drivers perform three major functions. First, they generate MongoDB
object IDs. These are the default values stored in the _id field of all documents.
Next, the drivers convert any language-specific representation of documents to and
from BSON, the binary data format used by MongoDB. In the foregoing examples, the
driver serializes all the Ruby hashes into BSON and then deserializes the BSON
that's returned from the database back to Ruby hashes.
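To make the serialization step concrete, here is a minimal sketch of what encoding a Ruby hash as BSON involves. It is not the driver's code: encode_string_document is a hypothetical helper written for illustration, it uses only Ruby's standard library, and it handles only flat hashes whose values are strings. A real driver supports many more types.

```ruby
# Illustrative only: encodes a flat hash of string values as a BSON document.
# encode_string_document is a hypothetical name, not part of any driver.
def encode_string_document(doc)
  body = String.new                        # empty binary (ASCII-8BIT) buffer
  doc.each do |key, value|
    name = key.to_s.b
    str  = value.to_s.encode("UTF-8").b
    body << "\x02".b                       # element type 0x02: UTF-8 string
    body << name << "\x00".b               # field name, NUL-terminated
    body << [str.bytesize + 1].pack("l<")  # little-endian int32 length, counting the trailing NUL
    body << str << "\x00".b                # string bytes plus trailing NUL
  end
  total = 4 + body.bytesize + 1            # length prefix + elements + document terminator
  [total].pack("l<") + body + "\x00".b
end

bytes = encode_string_document("hello" => "world")
puts bytes.bytesize         # 22
puts bytes.unpack1("H*")    # hex dump of the encoded document
```

The document begins with its own total length and ends with a NUL byte, so a reader on the other end of the socket knows how many bytes to consume before it starts parsing.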
The drivers' final function is to communicate with the database over a TCP socket
using the MongoDB wire protocol. The details of the protocol are beyond our scope.
But the style of socket communication, in particular whether writes on the socket wait
for a response, is important, and we'll explore the topic in this section.
3.2.1 Object ID generation
Every MongoDB document requires a primary key. That key, which must be unique
for all documents in each collection, is stored in the document's _id field.
Developers are free to use their own custom values as the _id, but when none is
provided, a MongoDB object ID will be used. Before sending a document to the
server, the driver checks whether the _id field is present. If the field is
missing, the driver generates an object ID and stores it as _id.
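The check described above can be sketched in a few lines of Ruby. This is an illustration under stated assumptions rather than the driver's implementation: with_id is a hypothetical helper, and SecureRandom.hex(12) is only a stand-in that yields 12 random bytes as 24 hex characters, not a real object ID.

```ruby
require "securerandom"

# Hypothetical helper: returns a document that is guaranteed to carry an _id.
# A real driver would generate a structured object ID here; the random
# 12-byte value below is a placeholder for illustration.
def with_id(doc)
  return doc if doc.key?("_id") || doc.key?(:_id)  # respect a caller-supplied key
  { "_id" => SecureRandom.hex(12) }.merge(doc)
end

custom    = with_id("_id" => "smith", "last_name" => "smith")
generated = with_id("last_name" => "jones")

puts custom["_id"]             # smith
puts generated["_id"].length   # 24
```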
Because a MongoDB object ID is a globally unique identifier, it's safe to assign the
ID to a document at the client without having to worry about creating a duplicate ID.
Now, you've certainly seen object IDs in the wild, but you may not have noticed that
they're made up of 12 bytes. These bytes have a specific structure which is illustrated
in figure 3.1.
The most significant four bytes carry a standard Unix timestamp that encodes the
number of seconds since the epoch. The next three bytes store the machine id, which
is followed by a two-byte process id. The final three bytes store a process-local
counter that's incremented each time an object ID is generated.
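The layout just described can be sketched as a small generator. Treat it as an approximation: ObjectIdSketch is a hypothetical class, and deriving the machine id from an MD5 digest of the hostname is an assumption made for this example, not something the text specifies.

```ruby
require "socket"
require "digest/md5"

# Hypothetical generator that follows the 4 + 3 + 2 + 3 byte layout.
class ObjectIdSketch
  def initialize
    @counter = 0
    @mutex   = Mutex.new
    # Assumption: the first 3 bytes of a hostname digest serve as the machine id.
    @machine = Digest::MD5.digest(Socket.gethostname)[0, 3]
  end

  def next_id(now = Time.now)
    # Process-local counter, wrapped so it always fits in 3 bytes.
    count = @mutex.synchronize { @counter = (@counter + 1) % 0x1000000 }
    bytes = [now.to_i].pack("N") +               # 4-byte big-endian timestamp
            @machine +                           # 3-byte machine id
            [Process.pid % 0x10000].pack("n") +  # 2-byte process id
            [count].pack("N")[1, 3]              # low 3 bytes of the counter
    bytes.unpack1("H*")                          # 24 hex characters
  end
end

generator = ObjectIdSketch.new
first  = generator.next_id(Time.at(0x4c291856))
second = generator.next_id(Time.at(0x4c291856))
puts first
puts second
```

Two IDs generated in the same second by the same process differ only in the counter bytes, which is why the counter is needed at all.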
Figure 3.1 MongoDB object ID format: 4c291856 (4-byte timestamp), 238d3b (machine id), 19b2 (process id), 000001 (counter)
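Going the other way, you can split the example value from figure 3.1 back into its fields. The sketch below assumes the ID arrives as a 24-character hex string; parse_object_id is a hypothetical helper written for this illustration, not a driver method.

```ruby
# Hypothetical helper: splits a 24-character hex object ID into its four fields.
def parse_object_id(hex)
  raise ArgumentError, "expected 24 hex characters" unless hex.match?(/\A\h{24}\z/)
  bytes = [hex].pack("H*")                           # 12 raw bytes
  {
    time:       Time.at(bytes[0, 4].unpack1("N")).utc,    # 4-byte big-endian timestamp
    machine_id: bytes[4, 3].unpack1("H*"),                # 3 bytes, shown as hex
    process_id: bytes[7, 2].unpack1("n"),                 # 2-byte big-endian integer
    counter:    ("\x00".b + bytes[9, 3]).unpack1("N")     # 3 bytes, padded to 4 for unpacking
  }
end

parts = parse_object_id("4c291856238d3b19b2000001")
puts parts[:time].year     # 2010
puts parts[:machine_id]    # 238d3b
puts parts[:process_id]    # 6578
puts parts[:counter]       # 1
```

Because the timestamp occupies the leading bytes, you can recover a document's approximate creation time from its object ID alone.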