Database Reference
In-Depth Information
combined, say by using the more general patterns of communication permitted by a work-
flow system (see Section 2.4.1 ).
Path Facts Versus Paths
We should be careful to distinguish between a path, which is a sequence of arcs, and a path fact, which is a statement
that there exists a path from some node x to some node y . The path fact has been shown typically as Path ( x, y ). Smart
transitive closure discovers each path only once, but it may discover a path fact more than once. The reason is that
often a graph will have many paths from x to y , and may even have many different paths from x to y that are of the
same length.
Not all paths are discovered independently by smart transitive closure. For instance, if there are arcs w x y
z and also arcs x u z , then the path fact Path ( w, z ) will be discovered twice, once by combining Path ( w, y ) with
Path ( y, z ) and again when combining Path ( w, u ) with Path ( u, z ). On the other hand, if the arcs were w x y z
and w v y , then Path ( w, z ) would be discovered only once, when combining Path ( w, y ) with Path ( y, z ).
10.8.6
Transitive Closure by Graph Reduction
A typical directed graph such as the Web contains many strongly connected components
(SCCs). We can collapse an SCC to a single node as far as computing the transitive closure
is concerned, since all the nodes in an SCC reach exactly the same nodes. There is an eleg-
ant algorithm for finding the SCCs of a graph in time linear in the size of the graph, due to
J.E. Hopcroft and R.E. Tarjan. However, this algorithm is inherently sequential, based on
depth-first search, and so not well suited to parallel impelementation on large graphs.
We can find most of the SCCs in a graph by some random node selections followed by
two breadth-first searches. Moreover, the larger an SCC is, the more likely it is to be col-
lapsed early, thus reducing the size of the graph quickly. The algorithm for reducing SCCs
to single nodes is as follows. Let G be the graph to be reduced, and let G ′ be G with all the
arcs reversed.
(1) Pick a node v from G at random.
(2) Find N G ( v, ∞), the set of nodes reachable from v in G .
(3) Find N G ( v, ∞), the set of nodes that v reaches in the graph G ′ that has the arcs of G
reversed. Equivalently, this set is those nodes that reach v in G .
(4) Construct the SCC S containing v , which is N G ( v, ∞) ∩ N G ( v, ∞). That is, v and u are
in the same SCC of G if and only if v can reach u and u can reach v .
(5) Replace SCC S by a single node s in G . To do so, delete all nodes in S from G and add
s to the node set of G . Delete from G all arcs of which one or both ends are in S . Then,
add to the arc set of G an arc s x whenever there was an arc in G from any member
of S to x . Finally, add an arc x s if there was an arc from x to any member of S .
Search WWH ::




Custom Search