In the worst case, where data is not uniformly distributed, the running
time could be linear and every item might be examined. In Exercise 5.25 you
are asked to construct such a case. However, if we assume that the items are
reasonably distributed, as with a phone book, the average number of
comparisons has been shown to be O(log log N). In other words, we apply the
logarithm twice in succession. For N = 4 billion, log N is about 32 and
log log N is roughly 5. Of course, there are some hidden constants in the
Big-Oh notation, but the extra logarithm can lower the number of iterations
considerably, so long as a bad case does not crop up. Proving the result
rigorously, however, is quite complicated.

Interpolation search has a better Big-Oh bound on average than does binary
search, but has limited practicality and a bad worst case.
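Although the text gives no code for it, a minimal sketch of interpolation
search in Java might look like the following; the method name and the use of
long arithmetic to avoid overflow are choices of this sketch, and the array
is assumed to be sorted:

    // Sketch of interpolation search on a sorted int array. Where binary
    // search always probes the middle, this probes the position suggested
    // by linear interpolation between the endpoint values.
    public static int interpolationSearch( int [ ] a, int x )
    {
        int low = 0;
        int high = a.length - 1;

        while( low <= high && x >= a[ low ] && x <= a[ high ] )
        {
            if( a[ high ] == a[ low ] )          // all remaining items equal
                return a[ low ] == x ? low : -1;

            // Probe proportionally to where x falls between a[low] and
            // a[high]; long arithmetic avoids int overflow in the numerator
            int mid = low + (int) ( (long) ( x - a[ low ] ) * ( high - low )
                                    / ( a[ high ] - a[ low ] ) );

            if( a[ mid ] < x )
                low = mid + 1;
            else if( a[ mid ] > x )
                high = mid - 1;
            else
                return mid;
        }

        return -1;                               // not found
    }

On uniformly distributed data the probe usually lands very close to the
target, which is where the O(log log N) average comes from; on skewed data
the probe can be badly wrong at every step, giving the linear worst case.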
5.7 checking an algorithm analysis
Once we have performed an algorithm analysis, we want to determine
whether it is correct and as good as we can possibly make it. One way to do
this is to code the program and see if the empirically observed running time
matches the running time predicted by the analysis.
When N increases by a factor of 10, the running time goes up by a factor
of 10 for linear programs, 100 for quadratic programs, and 1,000 for cubic
programs. Programs that run in O(N log N) take slightly more than 10 times
as long to run under the same circumstances. These increases can be hard to
spot if the lower-order terms have relatively large coefficients and N is not
large enough. An example is the jump from N = 10 to N = 100 in the running
time for the various implementations of the maximum contiguous subsequence
sum problem. Differentiating linear programs from O(N log N) programs, based
purely on empirical evidence, also can be very difficult.
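To see why, consider an O(N log N) program going from N = 10,000 to
N = 100,000: the running time grows by a factor of roughly
10 x (log 100,000 / log 10,000), which is about 12.5. That is close enough
to a linear program's factor of 10 that ordinary timing noise can hide the
difference.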
Another commonly used trick to verify that some program is O(F(N))
is to compute the values T(N)/F(N) for a range of N (usually spaced out by
factors of 2), where T(N) is the empirically observed running time. If F(N)
is a tight answer for the running time, then the computed values converge to a
positive constant. If F(N) is an overestimate, the values converge to zero. If
F(N) is an underestimate, and hence wrong, the values diverge.
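The reason is simple: if T(N) is roughly cF(N) plus lower-order terms, then
T(N)/F(N) is c plus a term that vanishes as N grows. If F(N) grows faster
than T(N), even the leading term is swamped and the ratio tends to zero; if
F(N) grows more slowly, the ratio increases without bound.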
As an example, suppose that we write a program to perform N random
searches using the binary search algorithm. Since each search is logarithmic,
we expect the total running time of the program to be O(N log N).
Figure 5.13 shows the actual observed running time for the routine for various
input sizes on a real (but extremely slow) computer. The last column is most
likely the converging column and thus confirms our analysis, whereas the
increasing numbers for T/N suggest that O(N) is an underestimate, and the
quickly decreasing values for T/N² suggest that O(N²) is an overestimate.
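A minimal sketch of such an experiment follows; the array contents, input
sizes, and output format are assumptions made for illustration (the timings
in Figure 5.13 were produced on the author's hardware). It times N random
searches with java.util.Arrays.binarySearch and prints the three ratios
discussed above:

    import java.util.Arrays;
    import java.util.Random;

    // Sketch of the experiment behind Figure 5.13: perform N random binary
    // searches on a sorted array of N items, measure the total time T, and
    // print T/N, T/(N log N), and T/N^2 for doubling values of N.
    public final class BinarySearchTiming
    {
        public static void main( String [ ] args )
        {
            Random rnd = new Random( 54321 );
            long sink = 0;          // consumed below so the JIT keeps the searches

            for( int n = 10000; n <= 640000; n *= 2 )
            {
                int [ ] a = new int [ n ];
                for( int i = 0; i < n; i++ )
                    a[ i ] = 2 * i;              // sorted, with gaps so some searches miss

                long start = System.nanoTime( );
                for( int i = 0; i < n; i++ )     // N random searches
                    sink += Arrays.binarySearch( a, rnd.nextInt( 2 * n ) );
                double t = System.nanoTime( ) - start;

                double logN = Math.log( n ) / Math.log( 2 );
                System.out.printf( "N = %7d   T/N = %8.2f   T/(N log N) = %8.4f   T/N^2 = %.8f%n",
                                   n, t / n, t / ( n * logN ), t / ( (double) n * n ) );
            }
            System.out.println( "(checksum: " + sink + ")" );
        }
    }

If the analysis is right, the T/(N log N) column should settle toward a
constant as N doubles, while T/N creeps upward and T/N² falls rapidly,
mirroring the behavior described for Figure 5.13.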