Java Reference
In-Depth Information
is still unlikely to happen. It does not take many good partitionings for Quicksort
to work fairly well.
Quicksort's best case occurs when findpivot always breaks the array into
two equal halves. Quicksort repeatedly splits the array into smaller partitions, as
shown in Figure 7.14. In the best case, the result will be log n levels of partitions,
with the top level having one array of size n, the second level two arrays of size n=2,
the next with four arrays of size n=4, and so on. Thus, at each level, all partition
steps for that level do a total of n work, for an overall cost of n log n work when
Quicksort finds perfect pivots.
Quicksort's average-case behavior falls somewhere between the extremes of
worst and best case. Average-case analysis considers the cost for all possible ar-
rangements of input, summing the costs and dividing by the number of cases. We
make one reasonable simplifying assumption: At each partition step, the pivot is
equally likely to end in any position in the (sorted) array. In other words, the pivot
is equally likely to break an array into partitions of sizes 0 and n1, or 1 and n2,
and so on.
Given this assumption, the average-case cost is computed from the following
equation:
n1
X
1
n
T(n) = cn +
[T(k) + T(n 1 k)]; T(0) = T(1) = c:
k=0
This equation is in the form of a recurrence relation. Recurrence relations are
discussed in Chapters 2 and 14, and this one is solved in Section 14.2.4. This
equation says that there is one chance in n that the pivot breaks the array into
subarrays of size 0 and n 1, one chance in n that the pivot breaks the array into
subarrays of size 1 and n2, and so on. The expression “T(k) + T(n1k)” is
the cost for the two recursive calls to Quicksort on two arrays of size k and n1k.
The initial cn term is the cost of doing the findpivot and partition steps, for
some constant c. The closed-form solution to this recurrence relation is (n log n).
Thus, Quicksort has average-case cost (n log n).
This is an unusual situation that the average case cost and the worst case cost
have asymptotically different growth rates. Consider what “average case” actually
means. We compute an average cost for inputs of size n by summing up for every
possible input of size n the product of the running time cost of that input times the
probability that that input will occur. To simplify things, we assumed that every
permutation is equally likely to occur. Thus, finding the average means summing
up the cost for every permutation and dividing by the number of inputs (n!). We
know that some of these n! inputs cost O(n 2 ). But the sum of all the permutation
costs has to be (n!)(O(n log n)). Given the extremely high cost of the worst inputs,
there must be very few of them. In fact, there cannot be a constant fraction of the
inputs with cost O(n 2 ). Even, say, 1% of the inputs with cost O(n 2 ) would lead to
Search WWH ::




Custom Search