first access is slow, because locating the first block of a file requires that header blocks
be fetched first; but subsequent accesses to the same block are fast, because in all
likelihood it will be cached in memory. Some deviousness is required to ensure that
averages over a series of runs are realistic. Now consider a more typical experiment:
real time taken to evaluate queries to a database system. If the queries are poorly
chosen, the times will vary because of the caching and the complex ways in which
the system components interact, and multiple runs will not give realistic figures.
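One way to keep averages over a series of runs realistic is to treat the cold first run separately from the warm runs that follow it. The sketch below is illustrative only: `timed` and `run_trials` are hypothetical helpers, not part of any particular benchmarking framework.

```python
import time

def timed(f, *args):
    """Return the wall-clock time of a single call to f."""
    start = time.perf_counter()
    f(*args)
    return time.perf_counter() - start

def run_trials(f, args, trials=5):
    """Time several consecutive runs of f on the same input.

    The first run is typically a cold-cache outlier, so report it
    separately rather than folding it into the average.
    """
    times = [timed(f, *args) for _ in range(trials)]
    cold, warm = times[0], times[1:]
    return cold, sum(warm) / len(warm)
```

Reporting the cold time and the warm average side by side makes it visible when caching, rather than the algorithm, dominates the measurement.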
Now consider another elementary experiment—comparing the speed of two algo-
rithms for the same task. The implementations (new and old) to be used for the
experiments take the same input and produce the same output, and thus are externally
indistinguishable. The algorithms are run in turn on the same data, and it is
observed that new is faster than old by several percent.
It would be easy to conclude that new is the better algorithm. However, on the
evidence so far, it would also be possible to conclude that:
- new is better implemented than was old. After all, new is your invention, and it is reasonable to take the care that is necessary to ensure that it runs well. Perhaps the same care was not taken with old.
- old uses more buffer space than new, leading to poor behaviour on this particular computer. The same results might not be observed on another machine.
- old uses floating-point operations that are not supported in hardware.
- At compile time, old was accidentally built with debug options enabled, slowing it down.
- Inaccuracies in the timing mechanism randomly favoured new. Although, for example, Unix-style timing mechanisms can return values in nanoseconds, their accuracy below tenths of a second is often questionable.
- old was run first, and was delayed while the input was copied to memory; new accessed the input directly from cache.
- The particular input chosen happened to favour new.
Further experiments are required to distinguish which of these conclusions is most
likely to be correct.
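Several of these confounds, notably run order and cache warmth, can be controlled in the measurement protocol itself. As a sketch only, assuming hypothetical callables new and old that take the same inputs, one might randomise the order of the two implementations on each trial and report a robust summary such as the median:

```python
import random
import statistics
import time

def compare(new, old, inputs, trials=10):
    """Time two implementations of the same task on identical inputs.

    The order of the two runs is randomised on each trial, so neither
    implementation consistently benefits from a warm cache; the median
    over trials damps random timing inaccuracies.
    """
    results = {"new": [], "old": []}
    for _ in range(trials):
        # Randomise which implementation runs first this trial.
        for name, f in random.sample([("new", new), ("old", old)], 2):
            start = time.perf_counter()
            for x in inputs:
                f(x)
            results[name].append(time.perf_counter() - start)
    return {name: statistics.median(ts) for name, ts in results.items()}
```

A protocol like this rules out some of the alternative conclusions (run order, random timing error) but not others, such as implementation quality or an unrepresentative input, which still require separate experiments.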
Some such effects are random, some are systematic—the same wrong measure-
ment is recorded every time the experiment is run. In work on text indexes some years
ago, we had some deeply puzzling results. The first stage went exactly as predicted:
we built an index for a small text collection (250 megabytes, in an era when the
capacity of a typical disk drive was a gigabyte or so) and tested a heuristic improve-
ment to the query evaluation algorithm. The test showed that the new method was
about twice as fast as the old. But how would it scale? My research assistant then
built an index for a gigabyte of data, and ran the same queries. The same result was
observed, with the new method about twice as fast as the old; but the queries on a
gigabyte used only 65% of the time needed for a quarter gigabyte.
Considering the detail of the experiment, this result made no sense at all. Two
quantities were independent of scale—the number of documents fetched and the total
number of disk accesses—but the index size scaled linearly with data volume, and
other measurements showed that four times as much index was being processed; yet