Building a Graph Database Application - Graph Databases

• Use nodes to represent entities, that is, the things that are of interest to us in our domain.

• Use relationships both to express the connections between entities and to establish semantic context for each entity, thereby structuring the domain.

• Use relationship direction to further clarify relationship semantics. Many relationships are asymmetrical, which is why relationships in a property graph are always directed. For bidirectional relationships, we should make our queries ignore direction.

• Use node properties to represent entity attributes, plus any necessary entity metadata, such as timestamps and version numbers.

• Use relationship properties to express the strength, weight, or quality of a relationship, plus any necessary relationship metadata, such as timestamps and version numbers.
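
The guidelines above can be sketched with a small in-memory property graph. This is a minimal illustration in Python, not the API of any real graph database; the class names (`Node`, `Relationship`, `Graph`), the `neighbors` helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the domain; attributes and metadata live in properties."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection from `start` to `end`."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.relationships = []

    def relate(self, start, rel_type, end, **properties):
        rel = Relationship(start, rel_type, end, properties)
        self.relationships.append(rel)
        return rel

    def neighbors(self, node, rel_type, direction="out"):
        """Follow relationships of one type; direction is 'out', 'in', or 'both'."""
        found = []
        for rel in self.relationships:
            if rel.type != rel_type:
                continue
            if direction in ("out", "both") and rel.start is node:
                found.append(rel.end)
            if direction in ("in", "both") and rel.end is node:
                found.append(rel.start)
        return found


graph = Graph()
alice = Node("Alice", {"created": "2013-01-01", "version": 1})
bob = Node("Bob")
acme = Node("Acme")

# Asymmetrical relationship: the direction carries meaning.
graph.relate(alice, "WORKS_FOR", acme, since=2010)
# Bidirectional in meaning, but still stored with a single direction.
graph.relate(alice, "FRIEND_OF", bob, strength=0.8)

# Respecting direction: Acme has no outgoing WORKS_FOR relationship.
employers_of_acme = graph.neighbors(acme, "WORKS_FOR")
# Ignoring direction: Bob finds Alice although the relationship starts at Alice.
bobs_friends = graph.neighbors(bob, "FRIEND_OF", direction="both")
```

Storing `FRIEND_OF` once and ignoring direction at query time avoids keeping a second, mirrored relationship in step with the first.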

It pays to be diligent about discovering and capturing domain entities. As we saw in Chapter 3, it's relatively easy to model things that really ought to be represented as nodes using sloppily named relationships instead. If we're tempted to use a relationship to model an entity (an email, or a review, for example), we must make certain that this entity cannot be related to more than two other entities. Remember, a relationship must have a start node and an end node: nothing more, nothing less. If we find later that we need to connect something we've modeled as a relationship to more than two other entities, we'll have to refactor the entity inside the relationship out into a separate node. This is a breaking change to the data model, and will likely require us to make changes to any queries and application code that produce or consume the data.
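
To illustrate that refactoring, the sketch below models an email first as a relationship and then as a node. It uses plain Python tuples and dictionaries rather than a real graph database; the relationship names (`EMAILED`, `SENT`, `TO`, `CC`), the helper `refactor_email_to_node`, and the sample data are assumptions made for illustration.

```python
# Each relationship is a (start, type, end, properties) tuple.
# Modeled as a relationship, the email can involve only a sender and one recipient.
email_as_relationship = ("alice", "EMAILED", "bob", {"subject": "Lunch?"})


def refactor_email_to_node(rel, email_id, cc=()):
    """Promote the entity hidden in an EMAILED relationship to its own node.

    Returns the new node's properties and the relationships that replace
    the original one. Extra recipients can now be attached via CC.
    """
    sender, _type, recipient, properties = rel
    node_properties = dict(properties)
    relationships = [
        (sender, "SENT", email_id, {}),
        (email_id, "TO", recipient, {}),
    ]
    relationships += [(email_id, "CC", person, {}) for person in cc]
    return node_properties, relationships


email_node, email_relationships = refactor_email_to_node(
    email_as_relationship, "email_1", cc=["charlie", "davina"]
)

# Queries written against EMAILED no longer match anything; they have to be
# rewritten to go through the email node (SENT, then TO or CC).
recipients = [
    end
    for start, rel_type, end, _props in email_relationships
    if start == "email_1" and rel_type in ("TO", "CC")
]
```

The single `EMAILED` relationship becomes one node and four relationships, which is why existing queries and application code have to change.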

Fine-Grained versus Generic Relationships

When designing relationship types we should be mindful of the trade-offs between using fine-grained relationship types versus generic relationships qualified with properties. It's the difference between using DELIVERY_ADDRESS and HOME_ADDRESS versus ADDRESS {type: 'delivery'} and ADDRESS {type: 'home'}.

Relationships are the royal road into the graph. Differentiating by relationship type is the best way of eliminating large swathes of the graph from a traversal. Using one or more property values to decide whether or not to follow a relationship incurs extra I/O the first time those properties are accessed, because the properties reside in a separate store file from the relationships (after that, however, they're cached).
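
The difference can be sketched in Python with a toy structure keyed by relationship type. This only illustrates why typing narrows a traversal; it does not model a database's store files or cache, and the names `relationships_by_type`, `addresses_fine_grained`, and `addresses_generic` are hypothetical.

```python
from collections import defaultdict

# Relationships leaving a single node, grouped by type: (end, properties).
relationships_by_type = defaultdict(list)

# Fine-grained modeling: the type alone identifies the kind of address.
relationships_by_type["DELIVERY_ADDRESS"].append(("addr_1", {}))
relationships_by_type["HOME_ADDRESS"].append(("addr_2", {}))

# Generic modeling: one type, qualified by a property.
relationships_by_type["ADDRESS"].append(("addr_1", {"type": "delivery"}))
relationships_by_type["ADDRESS"].append(("addr_2", {"type": "home"}))


def addresses_fine_grained(rel_type):
    """Select by type only; no relationship properties are read."""
    return [end for end, _props in relationships_by_type[rel_type]]


def addresses_generic(kind):
    """Select by type, then read a property on every candidate relationship."""
    property_reads = 0
    matches = []
    for end, properties in relationships_by_type["ADDRESS"]:
        property_reads += 1  # stands in for the extra property lookup
        if properties.get("type") == kind:
            matches.append(end)
    return matches, property_reads


delivery_fine = addresses_fine_grained("DELIVERY_ADDRESS")
delivery_generic, reads = addresses_generic("delivery")
```

Both calls find the same address, but the generic version has to inspect the properties of every `ADDRESS` relationship before it can discard the ones it does not want.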

We use fine-grained relationships whenever we have a closed set of relationship types. In contrast, weightings (as required by a shortest-weighted-path algorithm) rarely comprise a closed set, and these are usually best represented as properties on relationships.
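
As a rough sketch of weights held as relationship properties, the following Python runs Dijkstra's algorithm over relationships that each carry a `weight` property. The sample data, the `ROAD` type, and the function name `shortest_weighted_path` are hypothetical, and a graph database would normally supply its own path-finding routines.

```python
import heapq

# Directed relationships: (start, type, end, properties).
relationships = [
    ("a", "ROAD", "b", {"weight": 4.0}),
    ("a", "ROAD", "c", {"weight": 1.0}),
    ("c", "ROAD", "b", {"weight": 2.0}),
    ("b", "ROAD", "d", {"weight": 5.0}),
]


def shortest_weighted_path(rels, source, target, rel_type="ROAD"):
    """Dijkstra's algorithm; assumes non-negative 'weight' properties."""
    # Keep only relationships of the requested type, indexed by start node.
    outgoing = {}
    for start, r_type, end, props in rels:
        if r_type == rel_type:
            outgoing.setdefault(start, []).append((end, props["weight"]))

    queue = [(0.0, source, [source])]  # (cost so far, node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in outgoing.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None  # target is not reachable from source


cost, path = shortest_weighted_path(relationships, "a", "d")
```

The relationship type is used once to select which relationships take part, while the open-ended numeric weight stays in a property, where it can take any value.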
