Java Reference
In-Depth Information
• Equivalently, a tree with n nodes requires at least dlog(n + 1)e levels.
What is the minimum number of nodes that must be in the decision tree for any
comparison-based sorting algorithm for n values? Because sorting algorithms are
in the business of determining which unique permutation of the input corresponds
to the sorted list, the decision tree for any sorting algorithm must contain at least
one leaf node for each possible permutation. There are n! permutations for a set of
n numbers (see Section 2.2).
Because there are at least n! nodes in the tree, we know that the tree must
have (log n!) levels. From Stirling's approximation (Section 2.2), we know log n!
is in (n log n). The decision tree for any comparison-based sorting algorithm
must have nodes (n log n) levels deep. Thus, in the worst case, any such sorting
algorithm must require (n log n) comparisons.
Any sorting algorithm requiring (n log n) comparisons in the worst case re-
quires (n log n) running time in the worst case. Because any sorting algorithm
requires (n log n) running time, the problem of sorting also requires (n log n)
time. We already know of sorting algorithms with O(n log n) running time, so we
can conclude that the problem of sorting requires (n log n) time. As a corol-
lary, we know that no comparison-based sorting algorithm can improve on existing
(n log n) time sorting algorithms by more than a constant factor.
7.10
Further Reading
The definitive reference on sorting is Donald E. Knuth's Sorting and Searching
[Knu98]. A wealth of details is covered there, including optimal sorts for small
size n and special purpose sorting networks. It is a thorough (although somewhat
dated) treatment on sorting. For an analysis of Quicksort and a thorough survey
on its optimizations, see Robert Sedgewick's Quicksort [Sed80]. Sedgewick's Al-
gorithms [Sed11] discusses most of the sorting algorithms described here and pays
special attention to efficient implementation. The optimized Mergesort version of
Section 7.4 comes from Sedgewick.
While (n log n) is the theoretical lower bound in the worst case for sorting,
many times the input is sufficiently well ordered that certain algorithms can take
advantage of this fact to speed the sorting process. A simple example is Insertion
Sort's best-case running time. Sorting algorithms whose running time is based on
the amount of disorder in the input are called adaptive. For more information on
adaptive sorting algorithms, see “A Survey of Adaptive Sorting Algorithms” by
Estivill-Castro and Wood [ECW92].
7.11
Exercises
7.1 Using induction, prove that Insertion Sort will always produce a sorted array.
Search WWH ::




Custom Search