Databases Reference
In-Depth Information
This subquery works as follows:
•
WITH
pipes one or more triples, comprising
upLeg
,
downLeg
, and
topRoute
, to the
current query. There will be one triple for each of the paths matched by the third
subquery, with each path being bound to
topRoute
in successive triples (the paths
bound to
upLeg
and
downLeg
will stay the same, because the first and second sub‐
queries matched only one path each). Each triple is accompanied by a
score
for the
path bound to
topRoute
for that triple. This score is calculated using Cypher's
reduce
function, which for each triple sums the
cost
properties on the relationships
in the path currently bound to
topRoute
;
reduce
was introduced with Neo4j 1.9.
The triples are then ordered by this
score
, lowest first, and then limited to the first
triple in the sorted list.
•
RETURN
sums the nodes in the paths
upLeg
,
topRoute
, and
downLeg
to produce the
final results. The
tail
function drops the first node in each of paths
topRoute
and
downLeg
, because that node will already be present in the preceding path.
Implementing route calculation with the traversal framework
The time-critical nature of the route calculation coupled with the high throughput of
the parcel network impose strict demands on the route calculation engine. As long as
the individual query latencies are low enough, it's always possible to scale horizontally
for increased throughput. The Cypher-based solution is fast, but with such high sus‐
tained throughput, every millisecond impacts the cluster footprint. For this reason,
Global Post adopted an alternative approach: calculating routes using Neo4j's traversal
framework.
A traversal-based implementation of the route calculation engine must solve two prob‐
lems: finding shortest paths, and filtering paths based on time period. We'll look at
filtering paths based on time period first.
Traversals should follow only relationships that are valid for the specified delivery pe‐
riod. In other words, as the traversal progresses through the graph, it should only be
presented with relationships whose periods of validity, as defined by their
start_date
and
end_date
properties, contain the specified delivery period.
We implement this relationship filtering using a
PathExpander
. Given a path from a
traversal's start node to the node where it is currently positioned, a
PathExpander
's
expand()
method returns the relationships that can be used to traverse further. This
method is called by the traversal framework each time the framework advances another
node into the graph. If needed, the client can supply some initial state that flows along
each branch;
expand()
can then use (and even change) this state in the course of de‐
ciding which relationships to return. The route calculator's
ValidPathExpander
imple‐
mentation uses this branch state to supply the delivery period to the expander.