Java Reference
In-Depth Information
In the remainder of this section, we prove a slightly weaker result. We
show that any sequence of M = Ω ( N ) union and find operations takes a total of
O ( M log* N ) time. The same bound holds if we replace union-by-rank with
union-by-size. This analysis is probably the most complex in this text and is
one of the first truly complex analyses ever performed for an algorithm that is
essentially trivial to implement. By extending this technique, we can show the
stronger bound claimed previously.
24.6.1 analysis of the union/find algorithm
In this section, we establish a fairly tight bound on the running time of a
sequence of M = Ω ( N ) union and find operations. The union and find opera-
tions may occur in any order, but union is done by rank and find is done with
path compression.
We begin with some theorems concerning the number of nodes of rank r .
Intuitively, because of the union-by-rank rule, there are many more nodes of small
rank than of large rank. In particular, there can be at most one node of rank log N .
What we want to do is to produce as precise a bound as possible on the number of
nodes of any particular rank r . Because ranks change only when union operations
are performed (and then only when the two trees have the same rank), we can
prove this bound by ignoring path compression. We do so in Theorem 24.1.
Theorem 24.1
In the absence of path compression, when a sequence of union instructions is being
executed, a node of rank r must have 2 r descendants (including itself).
Proof
The proof is by induction. The basis r = 0 is clearly true. Let T be the tree of rank r
with the fewest number of descendants and x be T 's root. Suppose that the last
union with which x was involved was between T 1 and T 2 . Suppose that T 1 's root was
x . If T 1 had rank r , then T 1 would be a tree of rank r with fewer descendants than T .
This condition contradicts the assumption that T is the tree with the smallest number
of descendants. Hence the rank of T 1 is at most r - 1 . The rank of T 2 is at most the
rank of T 1 because of union-by-rank. As T has rank r and the rank could only
increase because of T 2 , it follows that the rank of T 2 is r - 1 . Then the rank of T 1 is
also r - 1 . By the induction hypothesis, each tree has at least 2 r - 1 descendants,
giving a total of 2 r and establishing the theorem.
Theorem 24.1 says that if no path compression is performed, any node of
rank r must have at least 2 r descendants. Path compression can change this
condition, of course, because it can remove descendants from a node. However,
 
Search WWH ::




Custom Search