Graphs in the Real World - Graph Databases

Adding WORKED_WITH relationships

The query for finding colleagues and colleagues-of-colleagues with particular interests
is the one most frequently executed on Talent.net's site, and the success of the site
depends in large part on its performance. The query uses pairs of WORKED_ON relationships
(for example, ('Sarah')-[:WORKED_ON]->('Next Gen Platform')<-[:WORKED_ON]-('Charlie'))
to infer that users have worked with one another. Although reasonably performant, this is
nonetheless inefficient, because it requires traversing two explicit relationships to
infer the presence of a single implicit relationship.

To eliminate this inefficiency, Talent.net now precomputes WORKED_WITH relationships,
thereby enriching the data and providing shorter paths for these performance-critical
access patterns. As we discussed in “Iterative and Incremental Development” on page 72,
it's quite common to optimize graph access by adding a direct relationship between
two nodes that would otherwise be connected only by way of intermediaries.
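
To make the difference concrete, here is a minimal Python sketch, not Talent.net's actual implementation, that models the graph with plain dictionaries. The names worked_on, project_members, colleagues_two_hop, and worked_with are illustrative assumptions, as are the 'Emily' and 'Quantum Leap' entries. It contrasts inferring colleagues by two hops with reading a precomputed direct relationship.

```python
from collections import defaultdict

# Hypothetical sample data: person -> projects reached via WORKED_ON.
worked_on = {
    "Sarah": {"Next Gen Platform"},
    "Charlie": {"Next Gen Platform", "Quantum Leap"},
    "Emily": {"Quantum Leap"},
}

# Reverse index: project -> people, i.e. WORKED_ON followed backwards.
project_members = defaultdict(set)
for person, projects in worked_on.items():
    for project in projects:
        project_members[project].add(person)


def colleagues_two_hop(person):
    """Infer colleagues by traversing two relationships per result."""
    found = set()
    for project in worked_on[person]:            # hop 1: person -> project
        for other in project_members[project]:   # hop 2: project -> person
            if other != person:
                found.add(other)
    return found


# Precompute the implicit relationship once, as a direct person-to-person link.
worked_with = {person: colleagues_two_hop(person) for person in worked_on}


def colleagues_direct(person):
    """Read colleagues with a single lookup over the precomputed link."""
    return worked_with[person]
```

Both functions return the same colleagues; the precomputed version simply pays the two-hop cost once, at write time, instead of on every read.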

In terms of the Talent.net domain, WORKED_WITH is a bidirectional relationship. In the
graph, however, it is implemented using a unidirectional relationship. Although a
relationship's direction can often add useful semantics to its definition, in this
instance the direction is meaningless. This isn't a significant issue, so long as queries
that operate with WORKED_WITH relationships ignore the relationship direction.
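
The point about ignoring direction can be illustrated with a small Python sketch. It assumes each WORKED_WITH relationship is stored exactly once as a (start, end) tuple; the names worked_with_edges, colleagues, and outgoing_only are hypothetical, as is the sample data.

```python
# Each WORKED_WITH relationship is stored once, in an arbitrary direction.
# (Hypothetical sample data.)
worked_with_edges = {("Sarah", "Charlie"), ("Emily", "Charlie")}


def colleagues(person):
    """Direction-agnostic lookup: match the person at either end of the edge."""
    return {
        end if start == person else start
        for start, end in worked_with_edges
        if person in (start, end)
    }


def outgoing_only(person):
    """Direction-sensitive lookup, shown only to illustrate what goes wrong."""
    return {end for start, end in worked_with_edges if start == person}
```

In this sample, Charlie appears only at the end of each stored edge, so the direction-sensitive lookup finds no colleagues for him, while the direction-agnostic one finds both Sarah and Emily.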

Calculating a user's WORKED_WITH relationships and adding them to the graph isn't
difficult, nor is it particularly expensive in terms of resource consumption. It can,
however, add milliseconds to any end-user interactions that update a user's profile with
new project information, so Talent.net has decided to perform this operation
asynchronously to end-user activities. Whenever a user changes his project history,
Talent.net adds a job that recalculates that user's WORKED_WITH relationships to a queue.
A single writer thread polls this queue and executes the jobs using the following Cypher
statement:

START subject = node:user(name={name})
MATCH (subject)-[:WORKED_ON]->()<-[:WORKED_ON]-(person)
WHERE NOT ((subject)-[:WORKED_WITH]-(person))
WITH DISTINCT subject, person
CREATE UNIQUE (subject)-[:WORKED_WITH]-(person)
RETURN subject.name AS startName, person.name AS endName

Figure 5-7 shows what our sample graph looks like once it has been enriched with
WORKED_WITH relationships.
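
The queue-and-single-writer arrangement described above might be sketched in Python as follows. This is an assumption-laden illustration rather than Talent.net's code: the in-memory dictionaries stand in for the graph, and the names jobs, recalculate, writer, and update_project_history are hypothetical. The recalculate function mirrors the intent of the Cypher statement by creating a WORKED_WITH link only where none exists yet.

```python
import queue
import threading

worked_on = {}                 # person -> set of projects (stand-in for the graph)
worked_with = set()            # frozenset({a, b}) pairs, so direction is irrelevant
graph_lock = threading.Lock()  # guards both structures above
jobs = queue.Queue()           # names of users whose links need recalculating


def recalculate(subject):
    """Add missing WORKED_WITH links for one user; return the pairs created."""
    created = []
    with graph_lock:
        subject_projects = worked_on.get(subject, set())
        for person, projects in worked_on.items():
            pair = frozenset((subject, person))
            shares_project = bool(projects & subject_projects)
            if person != subject and shares_project and pair not in worked_with:
                worked_with.add(pair)
                created.append((subject, person))
    return created


def writer():
    """Single writer: take jobs off the queue one at a time until told to stop."""
    while True:
        name = jobs.get()
        try:
            if name is None:   # sentinel value used here to stop the thread
                break
            recalculate(name)
        finally:
            jobs.task_done()


def update_project_history(name, project):
    """End-user path: record the change, enqueue the job, return immediately."""
    with graph_lock:
        worked_on.setdefault(name, set()).add(project)
    jobs.put(name)


writer_thread = threading.Thread(target=writer, daemon=True)
writer_thread.start()

update_project_history("Sarah", "Next Gen Platform")
update_project_history("Charlie", "Next Gen Platform")

jobs.join()      # wait until the queued recalculations have run
jobs.put(None)   # ask the writer to stop
writer_thread.join()
```

Because only one thread ever adds WORKED_WITH links, two jobs cannot race to create the same link, and the end-user update returns as soon as the job has been enqueued.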
