Adding WORKED_WITH relationships
The query for finding colleagues and colleagues-of-colleagues with particular interests is the one most frequently executed on Talent.net's site, and the success of the site depends in large part on its performance. The query uses pairs of WORKED_ON relationships (for example, ('Sarah')-[:WORKED_ON]->('Next Gen Platform')<-[:WORKED_ON]-('Charlie')) to infer that users have worked with one another. Although reasonably performant, this is nonetheless inefficient, because it requires traversing two explicit relationships to infer the presence of a single implicit relationship.
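To make the two-hop inference concrete, here is a minimal in-memory sketch in Python. It is not Talent.net's implementation: the (person, project) pairs stand in for WORKED_ON relationships, and the second project and the user Emily are hypothetical additions for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: each (person, project) pair stands in for one
# WORKED_ON relationship. Only Sarah, Charlie and 'Next Gen Platform' come
# from the text; the rest is made up for illustration.
WORKED_ON = [
    ("Sarah", "Next Gen Platform"),
    ("Charlie", "Next Gen Platform"),
    ("Sarah", "Quantum Leap"),
    ("Emily", "Quantum Leap"),
]


def infer_colleagues(worked_on, subject):
    """Return everyone who shares at least one project with subject."""
    projects_by_person = defaultdict(set)
    people_by_project = defaultdict(set)
    for person, project in worked_on:
        projects_by_person[person].add(project)
        people_by_project[project].add(person)

    colleagues = set()
    for project in projects_by_person[subject]:   # first hop: subject -> project
        colleagues |= people_by_project[project]  # second hop: project -> person
    colleagues.discard(subject)                   # a user is not their own colleague
    return colleagues
```

Every lookup pays for both hops, which is the cost that a precomputed direct relationship would avoid.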
To eliminate this inefficiency, Talent.net now precomputes WORKED_WITH relationships, thereby enriching the data and providing shorter paths for these performance-critical access patterns. As we discussed in “Iterative and Incremental Development” on page 72, it's quite common to optimize graph access by adding a direct relationship between two nodes that would otherwise be connected only by way of intermediaries.
In terms of the Talent.net domain, WORKED_WITH is a bidirectional relationship. In the graph, however, it is implemented using a unidirectional relationship. Although a relationship's direction can often add useful semantics to its definition, in this instance the direction is meaningless. This isn't a significant issue, so long as queries that operate with WORKED_WITH relationships ignore the relationship direction.
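The idea of storing each relationship once and ignoring its direction when reading can be sketched in Python as follows. This is an illustrative model rather than how a graph database stores relationships. The tuple layout and the `colleagues_of` helper are assumptions, and Emily is a hypothetical user.

```python
# Each WORKED_WITH relationship is stored exactly once, as a (start, end)
# tuple whose direction is arbitrary. Sample data is hypothetical.
WORKED_WITH = {("Sarah", "Charlie"), ("Emily", "Sarah")}


def colleagues_of(worked_with, name):
    """Match the relationship from either end, ignoring its stored direction."""
    result = set()
    for start, end in worked_with:
        if start == name:
            result.add(end)
        elif end == name:
            result.add(start)
    return result
```

Because the lookup checks both ends, Sarah's colleagues include Charlie (stored as an outgoing relationship) and Emily (stored as an incoming one), which mirrors a pattern written without an arrowhead, such as (subject)-[:WORKED_WITH]-(person).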
Calculating a user's WORKED_WITH relationships and adding them to the graph isn't difficult, nor is it particularly expensive in terms of resource consumption. It can, however, add milliseconds to any end-user interactions that update a user's profile with new project information, so Talent.net has decided to perform this operation asynchronously to end-user activities. Whenever a user changes his project history, Talent.net adds a job that recalculates that user's WORKED_WITH relationships to a queue. A single writer thread polls this queue and executes the jobs using the following Cypher statement:
START subject = node:user(name={name})
MATCH (subject)-[:WORKED_ON]->()<-[:WORKED_ON]-(person)
WHERE NOT ((subject)-[:WORKED_WITH]-(person))
WITH DISTINCT subject, person
CREATE UNIQUE (subject)-[:WORKED_WITH]-(person)
RETURN subject.name AS startName, person.name AS endName
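The queue-and-single-writer arrangement described above might look roughly like the following Python sketch. All names here are hypothetical. The `recalculate` callback stands in for executing the Cypher statement with the user's name bound to the {name} parameter, and a blocking `get()` is used in place of explicit polling.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the writer thread to shut down


def writer_loop(jobs, recalculate):
    """Single writer: take one job at a time and recalculate that user."""
    while True:
        name = jobs.get()      # blocks until a job is available
        if name is _STOP:
            break
        recalculate(name)      # assumed to run the Cypher statement for `name`


def process_profile_updates(names, recalculate):
    """Enqueue one job per changed profile and let one thread drain the queue."""
    jobs = queue.Queue()
    writer = threading.Thread(target=writer_loop, args=(jobs, recalculate))
    writer.start()
    for name in names:         # in a real site, request handlers would enqueue
        jobs.put(name)
    jobs.put(_STOP)
    writer.join()
```

Since only one thread consumes the queue, jobs run in the order they were enqueued and writes to the graph never race with each other, while the request that changed the profile returns without waiting for the recalculation.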
Figure 5-7 shows what our sample graph looks like once it has been enriched with WORKED_WITH relationships.