• Use nodes to represent entities—that is, the things that are of interest to us in our
domain.
• Use relationships both to express the connections between entities and to establish
semantic context for each entity, thereby structuring the domain.
• Use relationship direction to further clarify relationship semantics. Many relationships
are asymmetrical, which is why relationships in a property graph are always
directed. For bidirectional relationships, we should make our queries ignore
direction.
• Use node properties to represent entity attributes, plus any necessary entity metadata,
such as timestamps, version numbers, etc.
• Use relationship properties to express the strength, weight, or quality of a relationship,
plus any necessary relationship metadata, such as timestamps, version numbers,
etc.
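The guidelines above can be illustrated with a minimal in-memory sketch in Python. The Node and Relationship classes, the neighbors helper, and all the sample data are hypothetical names invented for this illustration; they are not the API of any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the domain; attributes and metadata live in properties."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection from a start node to an end node."""
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


def neighbors(relationships, node, rel_type, ignore_direction=False):
    """Return the names of nodes connected to `node` by `rel_type`.

    Only outgoing relationships are followed by default. With
    ignore_direction=True incoming ones are followed as well, which is
    how a query can treat a relationship as bidirectional.
    """
    found = []
    for rel in relationships:
        if rel.rel_type != rel_type:
            continue
        if rel.start is node:
            found.append(rel.end.name)
        elif ignore_direction and rel.end is node:
            found.append(rel.start.name)
    return found


# Node properties hold entity attributes and metadata (here, a timestamp).
alice = Node("Alice", {"created": "2013-01-15"})
bob = Node("Bob")
acme = Node("Acme")

# Relationship properties hold strength or weight and relationship metadata.
relationships = [
    Relationship(alice, "WORKS_FOR", acme, {"since": 2010}),
    Relationship(alice, "FRIEND_OF", bob, {"strength": 0.8}),
]

print(neighbors(relationships, bob, "FRIEND_OF"))
print(neighbors(relationships, bob, "FRIEND_OF", ignore_direction=True))
```

The FRIEND_OF relationship is stored once, directed from Alice to Bob. Following it strictly from Bob finds nothing, whereas ignoring direction finds Alice, so the relationship does not need to be stored twice.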
It pays to be diligent about discovering and capturing domain entities. As we saw in
Chapter 3, it's relatively easy to model things that really ought to be represented as nodes
using sloppily named relationships instead. If we're tempted to use a relationship to
model an entity—an email, or a review, for example—we must make certain that this
entity cannot be related to more than two other entities. Remember, a relationship must
have a start node and an end node—nothing more, nothing less. If we find later that we
need to connect something we've modeled as a relationship to more than two other
entities, we'll have to refactor the entity inside the relationship out into a separate node.
This is a breaking change to the data model, and will likely require us to make changes
to any queries and application code that produce or consume the data.
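As a rough illustration of why this matters, the following Python sketch models an email first as a relationship and then as a node. The tuple layout, the relationship names (EMAILED, SENT, TO, CC, REPLY_TO), and the connected_entities helper are assumptions made for this example only.

```python
# Email modeled as a relationship: it has exactly one start node and one
# end node, so it cannot also be connected to a CC recipient or a reply.
email_as_relationship = ("Alice", "EMAILED", "Bob", {"subject": "Q3 plan"})

# Email refactored into a node ("email-1"): it can now take part in as
# many relationships as the domain requires.
email_as_node = [
    ("Alice", "SENT", "email-1"),
    ("email-1", "TO", "Bob"),
    ("email-1", "CC", "Charlie"),
    ("email-2", "REPLY_TO", "email-1"),
]


def connected_entities(relationships, node):
    """Return the set of nodes that share a relationship with `node`."""
    connected = set()
    for start, _rel_type, end in relationships:
        if start == node:
            connected.add(end)
        elif end == node:
            connected.add(start)
    return connected


print(sorted(connected_entities(email_as_node, "email-1")))
```

Once the email is a node it connects to four other entities, which the relationship form cannot express. Any query written against EMAILED would have to be rewritten to traverse SENT and TO instead, which is the breaking change described above.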
Fine-Grained versus Generic Relationships
When designing relationship types we should be mindful of the trade-offs between using
fine-grained relationship names versus generic relationships qualified with properties.
It's the difference between using DELIVERY_ADDRESS and HOME_ADDRESS versus ADDRESS
{type: 'delivery'} and ADDRESS {type: 'home'}.
Relationships are the royal road into the graph. Differentiating by relationship type is
the best way of eliminating large swathes of the graph from a traversal. Using one or
more property values to decide whether or not to follow a relationship incurs extra I/O
the first time those properties are accessed because the properties reside in a separate
store file from the relationships (after that, however, they're cached).
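The two approaches can be contrasted in a small Python sketch. The tuple layout, the sample addresses, and the follow_by_type and follow_by_property helpers are hypothetical. An in-memory list cannot reproduce the store-file I/O cost described above; the sketch only shows where the extra property lookup enters the traversal.

```python
# Fine-grained: the relationship type itself carries the distinction.
fine_grained = [
    ("Alice", "HOME_ADDRESS", "1 Elm Street", {}),
    ("Alice", "DELIVERY_ADDRESS", "9 Dock Road", {}),
]

# Generic: a single ADDRESS type, qualified by a `type` property.
generic = [
    ("Alice", "ADDRESS", "1 Elm Street", {"type": "home"}),
    ("Alice", "ADDRESS", "9 Dock Road", {"type": "delivery"}),
]


def follow_by_type(relationships, start, rel_type):
    """Select relationships by type alone; properties are never read."""
    return [
        end
        for s, t, end, _props in relationships
        if s == start and t == rel_type
    ]


def follow_by_property(relationships, start, rel_type, key, value):
    """Select by type, then inspect a property on every candidate."""
    return [
        end
        for s, t, end, props in relationships
        if s == start and t == rel_type and props.get(key) == value
    ]


print(follow_by_type(fine_grained, "Alice", "DELIVERY_ADDRESS"))
print(follow_by_property(generic, "Alice", "ADDRESS", "type", "delivery"))
```

Both calls find the same address. The second has to examine the properties of every ADDRESS relationship leaving the node before it can decide which ones to follow.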
We use fine-grained relationships whenever we have a closed set of relationship types.
In contrast, weightings—as required by a shortest-weighted-path algorithm—rarely
comprise a closed set, and these are usually best represented as properties on
relationships.
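To show weightings held as relationship properties, here is a minimal sketch of a shortest-weighted-path search using Dijkstra's algorithm in Python. The ROAD relationship type, the distance property, the sample graph, and the shortest_weighted_path function are all assumptions for illustration, and the weights are assumed to be non-negative.

```python
import heapq

# Each relationship: (start, type, end, properties). The weight is an
# open-ended numeric value, so it lives in a property, not in the type.
relationships = [
    ("A", "ROAD", "B", {"distance": 4.0}),
    ("A", "ROAD", "C", {"distance": 1.5}),
    ("C", "ROAD", "B", {"distance": 2.0}),
    ("B", "ROAD", "D", {"distance": 5.0}),
]


def shortest_weighted_path(relationships, source, target, weight_key):
    """Dijkstra's algorithm over directed relationships.

    Returns (total_cost, path) for the cheapest path from source to
    target, or None if the target is unreachable. Assumes non-negative
    weights.
    """
    adjacency = {}
    for start, _rel_type, end, props in relationships:
        adjacency.setdefault(start, []).append((end, props[weight_key]))

    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in adjacency.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


print(shortest_weighted_path(relationships, "A", "D", "distance"))
```

The route through C costs 8.5, less than the direct A to B to D route at 9.0. Encoding each distinct distance as its own relationship type would be impractical, because the set of possible values is not closed.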