Lower Bounds - Data Structures and Algorithm Analysis

Java Reference

In-Depth Information

Proof1: The winner must compare against all other elements, so there must be

n 1 comparisons. 2

This proof is clearly wrong, because the winner does not need to explicitly com-

pare against all other elements to be recognized. For example, a standard single-

elimination playoff sports tournament requires only n 1 comparisons, and the

winner does not play every opponent. So let's try again.

Proof2: Only the winner does not lose. There are n 1 losers. A single compar-

ison generates (at most) one (new) loser. Therefore, there must be n 1 compar-

isons. 2

This proof is sound. However, it will be useful later to abstract this by introduc-

ing the concept of posets as we did in Section 15.2.1. We can view the maximum-

finding problem as starting with a poset where there are no known relationships, so

every member of the collection is in its own separate DAG of one element.

Proof2a: To find the largest value, we start with a poset of n DAGs each with

a single element, and we must build a poset having all elements in one DAG such

that there is one maximum value (and by implication, n 1 losers). We wish to

connect the elements of the poset into a single DAG with the minimum number of

links. This requires at least n 1 links. A comparison provides at most one new

link. Thus, a minimum of n 1 comparisons must be made. 2

What is the average cost of largest ? Because it always does the same num-

ber of comparisons, clearly it must cost n 1 comparisons. We can also consider

the number of assignments that largest must do. Function largest might do

an assignment on any iteration of the for loop.

Because this event does happen, or does not happen, if we are given no informa-

tion about distribution we could guess that an assignment is made after each com-

parison with a probability of one half. But this is clearly wrong. In fact, largest

does an assignment on the ith iteration if and only if A [i] is the biggest of the the

first i elements. Assuming all permutations are equally likely, the probability of

this being true is 1=i. Thus, the average number of assignments done is

n

X

1

i =

1

i

1 +

i=2

i=1

which is the Harmonic Series H n . H n = (log n). More exactly, H n is close to

log e n.

How “reliable” is this average? That is, how much will a given run of the

program deviate from the mean cost? According to Cebysev's Inequality, an obser-

vation will fall within two standard deviations of the mean at least 75% of the time.

For Largest , the variance is

H n 2

6

= log e n 2

6 :

Data Structures and Algorithm Analysis

Search WWH ::

Custom Search

Home