In this example, the track list information is embedded in the document itself. This approach is both efficient
and well organized: all the information that you wish to store regarding this CD is added to a single
document. In the relational version of the CD database, this requires at least two tables; in the nonrelational database,
it requires only one collection and one document.
When information is retrieved for a given CD, that information only needs to be loaded from one document into
RAM, not from multiple documents. Remember that every reference requires another query in the database.
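As a rough sketch of that difference, the following Python snippet models both layouts with plain in-memory dictionaries rather than a live database. The `FakeCollection` class, its `get` method, and the sample CD and track data are all hypothetical; they exist only to count lookups and are not part of any MongoDB API.

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for a collection that counts lookups."""

    def __init__(self, docs):
        self._docs = {doc["_id"]: doc for doc in docs}
        self.queries = 0

    def get(self, doc_id):
        # Each call stands for one round trip to the database.
        self.queries += 1
        return self._docs[doc_id]


# Embedded layout: the track list lives inside the CD document.
embedded_cds = FakeCollection([
    {"_id": 1, "title": "Example Album",
     "tracks": [{"title": "First Track"}, {"title": "Second Track"}]},
])

# Referenced layout: the CD document only points at separate track documents.
referenced_cds = FakeCollection([
    {"_id": 1, "title": "Example Album", "track_ids": [101, 102]},
])
tracks = FakeCollection([
    {"_id": 101, "title": "First Track"},
    {"_id": 102, "title": "Second Track"},
])


def titles_embedded(cds, cd_id):
    cd = cds.get(cd_id)
    return [track["title"] for track in cd["tracks"]]


def titles_referenced(cds, track_docs, cd_id):
    cd = cds.get(cd_id)
    return [track_docs.get(track_id)["title"] for track_id in cd["track_ids"]]


embedded_titles = titles_embedded(embedded_cds, 1)
referenced_titles = titles_referenced(referenced_cds, tracks, 1)
embedded_lookups = embedded_cds.queries
referenced_lookups = referenced_cds.queries + tracks.queries
```

Both functions return the same titles, but the embedded layout needs a single lookup while the referenced layout in this sketch needs one lookup for the CD plus one per track.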
Tip: The rule of thumb when using MongoDB is to embed data whenever you can. This approach is far more efficient
and almost always viable.
At this point, you might be wondering about the use case in which an application has multiple users. Generally
speaking, a relational database version of the aforementioned CD app would require that you have one table that
contains all your users and two tables for the items added. For a nonrelational database, it would be good practice
to have separate collections for the users and the items added. For these kinds of problems, MongoDB allows you to
create references in two ways: manually or automatically. In the latter case, you use the DBRef specification, which
provides more flexibility when the referenced collection can change from one document to the next. You will learn more
about these two approaches in Chapter 4.
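To make the two styles concrete before Chapter 4, here is a minimal sketch that uses plain Python dictionaries in place of a real database. The `db` dictionary, the field names `owner_id` and `owner`, and the two `resolve_*` helpers are assumptions made for illustration; only the `$ref` and `$id` field names follow the DBRef convention.

```python
# Hypothetical in-memory "database": collection name -> {_id: document}.
db = {
    "users": {"u1": {"_id": "u1", "name": "Example User"}},
}

# Manual reference: store only the other document's _id. The application
# itself has to know that owner_id points into the "users" collection.
cd_manual = {"_id": "cd1", "title": "Example Album", "owner_id": "u1"}

# DBRef-style reference: the reference also names the target collection,
# so different documents could point into different collections.
cd_dbref = {
    "_id": "cd2",
    "title": "Another Album",
    "owner": {"$ref": "users", "$id": "u1"},
}


def resolve_manual(database, collection, doc_id):
    """Follow a manual reference; the caller supplies the collection name."""
    return database[collection][doc_id]


def resolve_dbref(database, ref):
    """Follow a DBRef-style reference; the collection name travels with it."""
    return database[ref["$ref"]][ref["$id"]]


owner_a = resolve_manual(db, "users", cd_manual["owner_id"])
owner_b = resolve_dbref(db, cd_dbref["owner"])
```

In both cases, following the reference is a second lookup after the CD document itself has been loaded.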
Creating the _id Field
Every object within the MongoDB database contains a unique identifier to distinguish that object from every other
object. This identifier is called the _id key, and it is added automatically to every document you create in a collection.
The _id key is the first attribute added in each new document you create. This remains true even if you do not tell
MongoDB to create the key. For example, none of the code in the preceding examples used the _id key. Nevertheless,
MongoDB created an _id key for you automatically in each document. It did so because the _id key is a mandatory
element for each document in the collection.
If you do not specify the _id value manually, the type will be set to a special BSON datatype (ObjectId) that consists of a
12-byte binary value. Thanks to its design, this value has a reasonably high probability of being unique. The 12-byte
value consists of a 4-byte timestamp (seconds since the epoch, January 1, 1970), a 3-byte machine ID, a 2-byte
process ID, and a 3-byte counter. It's good to know that the counter and timestamp fields are stored in Big Endian format.
This is because MongoDB wants to ensure that there is an increasing order to these values, and a Big Endian approach
suits this requirement best.
Note: The terms Big Endian and Little Endian refer to how the individual bytes of a longer data word are stored in
memory. Big endian simply means that the most significant byte is stored first. Similarly, little endian means that the
least significant byte is stored first.
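The ordering argument can be checked with a small sketch. The counter values below are arbitrary examples; the snippet encodes each as a 3-byte value in both byte orders and compares how the raw bytes sort.

```python
# Arbitrary increasing counter values, each small enough to fit in 3 bytes.
counters = [1, 255, 256, 65536]

# Encode every value as a 3-byte field in each byte order.
big = [n.to_bytes(3, "big") for n in counters]
little = [n.to_bytes(3, "little") for n in counters]

# Comparing raw bytes left to right matches numeric order only when the
# most significant byte comes first.
big_keeps_order = sorted(big) == big
little_keeps_order = sorted(little) == little
```

Here `big_keeps_order` is true and `little_keeps_order` is false: a byte-wise comparison of big-endian values preserves the increasing order of the numbers they encode.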
Figure 3-3 shows how the value of the _id key is built up and where the values come from.
Byte:   0  1  2  3  |  4  5  6  |  7  8  |  9  10  11
Field:  time        |  machine  |  pid   |  inc
Figure 3-3. Creating the _id key in MongoDB
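The layout in Figure 3-3 can be sketched in a few lines of Python using only the standard library. The helper name `split_object_id` and the sample hex string are hypothetical, and the split follows the byte layout described in this section; more recent MongoDB releases fill the machine and process ID bytes with a random value instead, so treat those two fields as opaque.

```python
from datetime import datetime, timezone


def split_object_id(hex_id):
    """Split a 24-character hex ObjectId string into the fields of Figure 3-3."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    # The timestamp and counter are big endian; the machine and pid bytes
    # are returned as hex strings without interpreting their byte order.
    seconds = int.from_bytes(raw[0:4], "big")
    return {
        "time": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine": raw[4:7].hex(),
        "pid": raw[7:9].hex(),
        "inc": int.from_bytes(raw[9:12], "big"),
    }


# Made-up sample value: time | machine | pid | inc
parts = split_object_id("4e6d0f2a" "a1b2c3" "1f90" "00002a")
```

Because the timestamp occupies the leading bytes, two identifiers generated in different seconds compare in creation order when their raw bytes are compared.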