Data Modeling with Graphs - Graph Databases

Here we retrieve all the emails that Bob has sent where he's CC'd one of his own aliases. Any emails that match this pattern are indicative of rogue behavior. And because both Cypher and the underlying graph database have graph affinity, these queries run very quickly, even over large datasets. This query returns the following results:

+------------------------------------------+
| email                                    |
+------------------------------------------+
| Node[6]{id:"1",content:"email contents"} |
+------------------------------------------+
1 row
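
The shape of this pattern can be sketched outside the database. The following Python snippet is a rough stand-in, not the Cypher query itself: it models the graph as a set of (start, type, end) triples and looks for emails Bob sent that CC a node which is an alias of Bob. The relationship type names SENT, TO, CC, and ALIAS_OF, the node names, and the helper `emails_cc_own_alias` are assumptions made for illustration.

```python
# Toy graph as (start, RELATIONSHIP_TYPE, end) triples.
# All node and relationship names here are illustrative assumptions.
edges = {
    ("bob", "SENT", "email_1"),
    ("email_1", "TO", "charlie"),
    ("email_1", "CC", "alice"),
    ("alice", "ALIAS_OF", "bob"),
    ("bob", "SENT", "email_2"),
    ("email_2", "TO", "davina"),
}

def emails_cc_own_alias(edges, sender):
    """Return emails that `sender` SENT and that CC a node which is ALIAS_OF `sender`."""
    aliases = {s for (s, t, e) in edges if t == "ALIAS_OF" and e == sender}
    sent = {e for (s, t, e) in edges if t == "SENT" and s == sender}
    matches = {s for (s, t, e) in edges if t == "CC" and s in sent and e in aliases}
    return sorted(matches)

print(emails_cc_own_alias(edges, "bob"))  # prints ['email_1']
```

A graph database evaluates such a pattern by traversing relationships from the starting node rather than scanning every triple as this sketch does.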

Evolving the Domain

As with any database, our graph serves a system that is likely to evolve over time. So what should we do when the graph evolves? How do we know what breaks, or indeed, how do we even tell that something has broken? The fact is, we can't avoid migrations in a graph database: they're a fact of life, just as with any data store. But in a graph database they're often simpler.

In a graph, to add new facts or compositions, we tend to add new nodes and relationships rather than changing the model in place. Adding to the graph using new kinds of relationships will not affect any existing queries, and is completely safe. Changing the graph using existing relationship types, or changing the properties (not just the property values) of existing nodes, might be safe, but we need to run a representative set of queries to maintain confidence that the graph is still fit for purpose after the structural changes. However, these activities are precisely the same kinds of actions we perform during normal database operation, so in a graph world a migration really is just business as normal.
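
One way to picture the "representative set of queries" is as a small regression check run against the graph before and after a change. The Python sketch below is only an illustration under stated assumptions: the graph is a toy set of triples, and the query functions, the `REPRESENTATIVE_QUERIES` table, and the relationship type names SENT, TO, CC, and ALIAS_OF are hypothetical.

```python
# Toy graph: a set of (start, RELATIONSHIP_TYPE, end) triples.
# All names are illustrative assumptions.
before = {
    ("bob", "SENT", "email_1"),
    ("email_1", "TO", "charlie"),
    ("email_1", "CC", "davina"),
}

def sent_by(graph, sender):
    """Emails that `sender` SENT."""
    return sorted(e for (s, t, e) in graph if s == sender and t == "SENT")

def recipients_of(graph, email):
    """Nodes reached from `email` via TO or CC."""
    return sorted(e for (s, t, e) in graph if s == email and t in ("TO", "CC"))

# A small, hypothetical "representative set" of queries with fixed arguments.
REPRESENTATIVE_QUERIES = {
    "sent_by_bob": lambda g: sent_by(g, "bob"),
    "recipients_of_email_1": lambda g: recipients_of(g, "email_1"),
}

def changed_queries(old_graph, new_graph):
    """Names of representative queries whose results differ between the two graphs."""
    return sorted(
        name
        for name, query in REPRESENTATIVE_QUERIES.items()
        if query(old_graph) != query(new_graph)
    )

# Additive change: a relationship type that no existing query mentions.
additive = before | {("alice", "ALIAS_OF", "bob")}

# Change that reuses an existing type: TO now also points at a mailing-list node.
reused = before | {("email_1", "TO", "sales_list")}

print(changed_queries(before, additive))  # prints []
print(changed_queries(before, reused))    # prints ['recipients_of_email_1']
```

A changed result is not necessarily a broken one. It only flags the queries that need a second look after the structural change.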

At this point we have a graph that describes who sent and received emails, as well as the content of the emails themselves. But of course, one of the joys of email is that recipients can forward or reply to an email they've received. This increases interaction and knowledge sharing, but in some cases leaks critical business information. Given that we're looking for suspicious communication patterns, it makes sense for us to also take into account forwarding and replies.

At first glance, there would appear to be no need to use database migrations to update our graph to support our new use case. The simplest additions we can make involve adding FORWARDED and REPLIED_TO relationships to the graph, as shown in Figure 3-11. Doing so won't affect any preexisting queries because they aren't coded to recognize the new relationships.
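
As a rough illustration of this step, the Python sketch below adds FORWARDED and REPLIED_TO triples to a toy graph and reruns a query written before they existed. It is not the book's Cypher. The direction of the new relationships (from a user to the email), the SENT, TO, and CC type names, the node names, and both helper functions are assumptions.

```python
# Toy graph as (start, RELATIONSHIP_TYPE, end) triples; names are illustrative.
graph = {
    ("bob", "SENT", "email_1"),
    ("email_1", "TO", "charlie"),
    ("email_1", "CC", "davina"),
}

def sent_emails(graph, sender):
    """Preexisting query: it only ever inspects SENT relationships."""
    return sorted(e for (s, t, e) in graph if s == sender and t == "SENT")

baseline = sent_emails(graph, "bob")

# Additive change. Assumed direction: (user)-[FORWARDED|REPLIED_TO]->(email).
graph |= {
    ("charlie", "FORWARDED", "email_1"),
    ("davina", "REPLIED_TO", "email_1"),
}

def follow_ups(graph, email):
    """New query: who forwarded or replied to `email`, and how."""
    return sorted(
        (s, t) for (s, t, e) in graph
        if e == email and t in ("FORWARDED", "REPLIED_TO")
    )

print(sent_emails(graph, "bob") == baseline)  # prints True
print(follow_ups(graph, "email_1"))
# prints [('charlie', 'FORWARDED'), ('davina', 'REPLIED_TO')]
```

The old query filters on a relationship type it already knew about, so the new triples are invisible to it. Only queries written for the new types see them.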
