We might think we can remedy the situation by adding properties to the EMAILED
relationship to represent an email's attributes, but that's just playing for time. Even with
properties attached to each EMAILED relationship, we would still be unable to correlate
between the EMAILED, CC, and BCC relationships; that is, we would be unable to say which
emails were copied versus which were blind-copied, and to whom.
The fact is we've unwittingly made a simple modeling mistake, caused mostly by a lax
use of English rather than any shortcomings of graph theory. Our everyday use of
language has led us to focus on the verb “emailed” rather than the email itself, and as a
result we've produced a model lacking domain insight.
In English, it's easy and convenient to shorten the phrase “Bob sent an email to Charlie”
to “Bob emailed Charlie”. In most cases that loss of a noun (the actual email) doesn't
matter because the intent is still clear. But when it comes to our forensics scenario, these
elided statements are problematic. The intent remains the same, but the details of the
number, contents, and recipients of the emails that Bob sent have been lost through
having been folded into an EMAILED relationship, rather than being modeled explicitly as
nodes in their own right.
Second Time's the Charm
To fix our lossy model, we need to insert email nodes to represent the real emails
exchanged within the business, and expand our set of relationship names to encompass
the full set of addressing fields that email supports. Now instead of creating lossy
structures like this:
CREATE (bob)-[:EMAILED]->(charlie)
we'll instead create more detailed structures, like this:
CREATE (email_1 {id: '1', content: 'Hi Charlie, ... Kind regards, Bob'}),
(bob)-[:SENT]->(email_1),
(email_1)-[:TO]->(charlie),
(email_1)-[:CC]->(davina),
(email_1)-[:CC]->(alice),
(email_1)-[:BCC]->(edward)
This results in the kind of graph structure we see in Figure 3-9.
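With emails modeled as nodes, the question the earlier model could not answer (who was copied versus blind-copied, and on which email) becomes a simple pattern match. The following is a minimal sketch rather than a definitive query: the User and Email labels, the username property, and the email_id, field, and recipient column names are assumptions made for illustration, and the setup statement re-creates the sample data so the example is self-contained.

```cypher
// Hypothetical setup: the :User and :Email labels and the username
// property are assumptions, not part of the structure shown above.
CREATE (bob:User {username: 'Bob'}),
       (charlie:User {username: 'Charlie'}),
       (davina:User {username: 'Davina'}),
       (alice:User {username: 'Alice'}),
       (edward:User {username: 'Edward'}),
       (email_1:Email {id: '1', content: 'Hi Charlie, ... Kind regards, Bob'}),
       (bob)-[:SENT]->(email_1),
       (email_1)-[:TO]->(charlie),
       (email_1)-[:CC]->(davina),
       (email_1)-[:CC]->(alice),
       (email_1)-[:BCC]->(edward);

// For every email Bob sent, list each recipient together with the
// addressing field (TO, CC, or BCC) that connects them to the email.
MATCH (sender:User {username: 'Bob'})-[:SENT]->(email:Email)-[r:TO|CC|BCC]->(person:User)
RETURN email.id AS email_id, type(r) AS field, person.username AS recipient
ORDER BY email_id, field, recipient;
```

Because each email is a node in its own right, every TO, CC, and BCC relationship hangs off a specific email, so the addressing field and the recipient can be read directly from the path instead of being inferred from separate, uncorrelated relationships between people.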