Java Reference
In-Depth Information
3.11
Empirical Analysis
This chapter has focused on asymptotic analysis. This is an analytic tool, whereby
we model the key aspects of an algorithm to determine the growth rate of the alg-
orithm as the input size grows. As pointed out previously, there are many limita-
tions to this approach. These include the effects at small problem size, determining
the finer distinctions between algorithms with the same growth rate, and the inher-
ent difficulty of doing mathematical modeling for more complex problems.
An alternative to analytical approaches are empirical ones. The most obvious
empirical approach is simply to run two competitors and see which performs better.
In this way we might overcome the deficiencies of analytical approaches.
Be warned that comparative timing of programs is a difficult business, often
subject to experimental errors arising from uncontrolled factors (system load, the
language or compiler used, etc.). The most important point is not to be biased in
favor of one of the programs. If you are biased, this is certain to be reflected in
the timings. One look at competing software or hardware vendors' advertisements
should convince you of this. The most common pitfall when writing two programs
to compare their performance is that one receives more code-tuning effort than the
other. As mentioned in Section 3.10, code tuning can often reduce running time by
a factor of ten. If the running times for two programs differ by a constant factor
regardless of input size (i.e., their growth rates are the same), then differences in
code tuning might account for any difference in running time.
Be suspicious of
empirical comparisons in this situation.
Another approach to analysis is simulation. The idea of simulation is to model
the problem with a computer program and then run it to get a result. In the con-
text of algorithm analysis, simulation is distinct from empirical comparison of two
competitors because the purpose of the simulation is to perform analysis that might
otherwise be too difficult. A good example of this appears in Figure 9.10. This
figure shows the cost for inserting or deleting a record from a hash table under two
different assumptions for the policy used to find a free slot in the table. The y axes
is the cost in number of hash table slots evaluated, and the x axes is the percentage
of slots in the table that are full. The mathematical equations for these curves can
be determined, but this is not so easy. A reasonable alternative is to write simple
variations on hashing. By timing the cost of the program for various loading con-
ditions, it is not difficult to construct a plot similar to Figure 9.10. The purpose of
this analysis is not to determine which approach to hashing is most efficient, so we
are not doing empirical comparison of hashing alternatives. Instead, the purpose
is to analyze the proper loading factor that would be used in an efficient hashing
system to balance time cost versus hash table size (space cost).
Search WWH ::




Custom Search