Database Reference
In-Depth Information
take the union of the result of this query with the Arc relation itself, then we get all paths
of length between 1 and 2 i +1 and can use the union as the Path relation in the next round
of recursive doubling. The query itself can be implemented by two MapReduce jobs, one
to do the join and the other to do the union and eliminate duplicates. As we observed for
the parallel reachability computation, the methods of Sections 2.3.7 and 2.3.8 suffice. The
union, discussed in Section 2.3.6 doesn't really require a MapReduce job of its own; it can
be combined with the duplicate-elimination.
If a graph has diameter d , then after log 2 d rounds of the above algorithm Path contains
all pairs ( x, y ) connected by a path of length at most d ; that is, it contains all pairs in the
transitive closure. Unless we already know d , one more round will be necessary to verify
that no more pairs can be found, but for large d , this process takes many fewer rounds than
the breadth-first search that we used for reachability.
However, the above recursive-doubling method does a lot of redundant work. An ex-
ample should make the point clear.
EXAMPLE 10.28 Suppose the shortest path from x 0 to x 17 is of length 17; in particular, let
there be a path x 0 x 1 → · · · → x 17 . We shall discover the fact Path ( x 0 , x 17 ) on the fifth
round, when Path contains all pairs connected by paths of length up to 16. The same path
from x 0 to x 17 will be discovered 16 times when we join Path with itself. That is, we can
take the fact Path ( x 0 , x 16 ) and combine it with Path ( x 16 , x 17 ) to obtain Path ( x 0 , x 17 ). Or we
can combine Path ( x 0 , x 15 ) with Path ( x 15 , x 17 ) to discover the same fact, and so on.
10.8.5
Smart Transitive Closure
A variant of recursive doubling that avoids discovering the same path more than once is
called smart transitive closure. Every path of length greater than 1 can be broken into a
head whose length is a power of 2, followed by a tail whose length is no greater than the
length of the head.
EXAMPLE 10.29 A path of length 13 has a head consisting of the first 8 arcs, followed by a
tail consisting of the last 5 arcs. A path of length 2 is a head of length 1 followed by a tail
of length 1. Note that 1 is a power of 2 (the 0th power), and the tail will be as long as the
head whenever the path itself has a length that is a power of 2.
To implement smart transitive closure in SQL, we introduce a relation Q ( X, Y ) whose
function after the i th round is to hold all pairs of nodes ( x, y ) such that the shortest path
from x to y is of length exactly 2 i . Also, after the i th round, Path ( x, y ) will be true if the
shortest path from x to y is of length at most 2 i +1 −1. Note that this interpretation of Path is
Search WWH ::




Custom Search