Internal Sorting - Data Structures and Algorithm Analysis

Java Reference

In-Depth Information

Or can we prove that there is no faster sorting algorithm by finding a tighter lower

bound?

This section presents one of the most important and most useful proofs in com-

puter science: No sorting algorithm based on key comparisons can possibly be

faster than (n log n) in the worst case. This proof is important for three reasons.

First, knowing that widely used sorting algorithms are asymptotically optimal is re-

assuring. In particular, it means that you need not bang your head against the wall

searching for an O(n) sorting algorithm (or at least not one in any way based on key

comparisons). Second, this proof is one of the few non-trivial lower-bounds proofs

that we have for any problem; that is, this proof provides one of the relatively few

instances where our lower bound is tighter than simply measuring the size of the

input and output. As such, it provides a useful model for proving lower bounds on

other problems. Finally, knowing a lower bound for sorting gives us a lower bound

in turn for other problems whose solution could be used as the basis for a sorting

algorithm. The process of deriving asymptotic bounds for one problem from the

asymptotic bounds of another is called a reduction, a concept further explored in

Chapter 17.

Except for the Radix Sort and Binsort, all of the sorting algorithms presented

in this chapter make decisions based on the direct comparison of two key values.

For example, Insertion Sort sequentially compares the value to be inserted into the

sorted list until a comparison against the next value in the list fails. In contrast,

Radix Sort has no direct comparison of key values. All decisions are based on the

value of specific digits in the key value, so it is possible to take approaches to sorting

that do not involve key comparisons. Of course, Radix Sort in the end does not

provide a more efficient sorting algorithm than comparison-based sorting. Thus,

empirical evidence suggests that comparison-based sorting is a good approach. 3

The proof that any comparison sort requires (n log n) comparisons in the

worst case is structured as follows. First, comparison-based decisions can be mod-

eled as the branches in a tree. This means that any sorting algorithm based on com-

parisons between records can be viewed as a binary tree whose nodes correspond to

the comparisons, and whose branches correspond to the possible outcomes. Next,

the minimum number of leaves in the resulting tree is shown to be the factorial of

n. Finally, the minimum depth of a tree with n! leaves is shown to be in (n log n).

Before presenting the proof of an (n log n) lower bound for sorting, we first

must define the concept of a decision tree. A decision tree is a binary tree that can

model the processing for any algorithm that makes binary decisions. Each (binary)

decision is represented by a branch in the tree. For the purpose of modeling sorting

algorithms, we count all comparisons of key values as decisions. If two keys are

3 The truth is stronger than this statement implies. In reality, Radix Sort relies on comparisons as

well and so can be modeled by the technique used in this section. The result is an (nlogn) bound

in the general case even for algorithms that look like Radix Sort.

Search WWH ::

Custom Search

Home