Starting with our logical data model, we prepare the initial MongoDB structure by deciding
for each relationship whether to embed or reference (Step 1). After making these changes
to the model, we can determine where to accommodate history (Step 2). History means
keeping track of how things change over time. Other structural refinements will most likely
be needed, such as adding indexes or sharding; both techniques are used to further
improve retrieval performance (Step 3). In Step 4, we review the work to get approval.
Frequently, action items come out of this review and confirm step that require us to go back
to Step 1 and refine the structure. Expect iteration through these steps. Once we have
received final confirmation of our work, we can move on to implementation (Step 5). Let's talk
more about each of these five steps.
STEP 1: EMBED OR REFERENCE
MongoDB resolves relationships in one of two ways: embed or reference. Recall from
Chapter 3 that embedded documents resolve relationships between data by storing related
data in a single document structure, and a reference is a pointer to another document.
MongoDB documents make it possible to embed document structures as sub-documents in a
field or an array within a document. Embedding is equivalent to the concept of denormalization,
in which entities are combined into one structure with the goal of creating a simpler
and better-performing structure, usually at the cost of extra redundancy and, therefore, storage
space. It is often easier to retrieve data from one structure than from five. Also,
the one structure can represent a logical concept such as a survey, invoice, or claim.
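As a minimal sketch of the two options, assume a hypothetical invoice with a customer and line items. The field names and values below are illustrative assumptions, not part of the model in this text, and the documents are written as plain Python dictionaries rather than stored in a database:

```python
# Referenced design: the invoice points to a separate customer document.
customer = {"_id": "C1", "name": "Example Customer"}
invoice_referenced = {
    "_id": "INV1",
    "customer_id": "C1",            # reference: a pointer to another document
    "line_item_ids": ["L1", "L2"],  # references to separate line-item documents
}

# Embedded design: the related data lives inside the invoice itself.
invoice_embedded = {
    "_id": "INV1",
    "customer": {"name": "Example Customer"},  # sub-document in a field
    "line_items": [                            # sub-documents in an array
        {"product": "Widget", "quantity": 2},
        {"product": "Gadget", "quantity": 1},
    ],
}
```

In the embedded form, the customer name would be repeated in every invoice for that customer, which is the redundancy cost of denormalization.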
If I need to retrieve data, it takes longer (most of the time) to get the data out of
multiple documents by referencing than out of a single document that embeds everything.
That is, having everything in one document is often better for retrieval performance than
referencing. If I frequently query on both Order and Product, having them in one document
would most likely take less time to retrieve than having them in separate documents and
having the Order reference the Product.
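The Order and Product case can be sketched by counting lookups. This is a simplified illustration that uses in-memory dictionaries as stand-ins for collections; the identifiers, field names, and helper functions are hypothetical:

```python
# In-memory stand-ins for collections, keyed by _id (hypothetical data).
products = {"P1": {"_id": "P1", "name": "Widget"}}
orders_referenced = {"O1": {"_id": "O1", "product_id": "P1"}}
orders_embedded = {"O1": {"_id": "O1", "product": {"name": "Widget"}}}


def product_name_by_reference(order_id):
    """Fetch the order, then follow its reference: two lookups."""
    order = orders_referenced[order_id]      # lookup 1: the order
    product = products[order["product_id"]]  # lookup 2: the referenced product
    return product["name"], 2


def product_name_embedded(order_id):
    """Fetch the order only: the product travels inside it."""
    order = orders_embedded[order_id]        # lookup 1: the order
    return order["product"]["name"], 1
```

The sketch counts lookups only. Actual retrieval time also depends on factors such as document size and how often the embedded data changes, so the lookup count is an indication and not a guarantee.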
Notice the language in the retrieval discussion above, though: “most of the time” and “most
likely” imply that embedding is not always faster, and retrieval speed should not be the only
factor to consider when deciding whether to embed or reference. The top five reasons for
embedding over referencing are:
1 Requirements state that data from two or more entities are frequently queried
together. If we are often viewing data from multiple entities together, it can
make sense to put them into one document. It is important to note that this
factor, which is possibly the most important factor in deciding whether to em-