from questions about what the assessment of the algorithm is intended to achieve, and what the likely limitations of and constraints on the experiment are, as follows.
Basis of evaluation. The basis of evaluation should be made explicit. Where algorithms are being compared, specify not only the environment but also the criteria used for comparison. For example, are the algorithms being compared for functionality or speed? Is speed to be examined asymptotically or for typical data? Is the data real or synthetic? When describing the performance of a new technique, it is helpful to compare it to a well-known standard.
A comparison should have a realistic basis. In particular, the basis should not
appear to favour the algorithm being demonstrated over existing algorithms. If the
basis of comparison is questionable, the results are questionable too.
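As a concrete illustration, the Python sketch below makes the basis of comparison explicit: both algorithms receive identical synthetic inputs, the criterion is wall-clock time, and the baseline is a well-known standard, the built-in sorted. The function new_sort is a hypothetical stand-in for whatever technique is being demonstrated, not an algorithm from this text.

    import random
    import time

    def timed(fn, data):
        # Wall-clock time for a single run of fn on data.
        start = time.perf_counter()
        fn(data)
        return time.perf_counter() - start

    def new_sort(data):
        # Hypothetical technique under evaluation; a placeholder only.
        return sorted(data)

    for n in (10_000, 100_000):
        data = [random.random() for _ in range(n)]
        for name, fn in (("standard", sorted), ("new", new_sort)):
            # Best of several trials, each on a fresh copy of the same input.
            best = min(timed(fn, list(data)) for _ in range(5))
            print(f"n={n:>7}  {name:>8}: {best:.4f} s")

Because both algorithms see the same inputs and the same timer, any difference in the reported figures reflects the algorithms rather than the experimental setup.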
Simplifying assumptions can be used to make mathematical analysis tractable, but
can give unrealistic models. Non-trivial simplifications should be carefully justified.
Processing time. Time (or speed) over some given input is one of the principal resources used by algorithms; others are memory, disk space, and disk and network traffic. Time is not always easy to measure, since it depends on factors such as CPU speed, cache sizes, system load, and hardware dependencies such as prefetch strategy. Times based on a mathematical model rather than on experiment should be clearly indicated as such.
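To make the distinction between timers concrete, the minimal Python sketch below measures the same workload with both a wall-clock timer (time.perf_counter) and a CPU-time timer (time.process_time). The workload function is an assumed stand-in for the algorithm under test.

    import time

    def workload():
        # CPU-bound stand-in for the algorithm being measured (assumed).
        return sum(i * i for i in range(1_000_000))

    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU-time timer for this process
    workload()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start

    # On a loaded system the wall-clock figure can greatly exceed the
    # CPU figure; reporting both makes the basis of measurement explicit.
    print(f"wall-clock: {wall:.4f} s   CPU: {cpu:.4f} s")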
Measurements of CPU time can be unreliable. CPU times in most systems are
counted as multiples of some fixed fraction of a second, say a sixty-fourth or a
thousandth. Each of these fractions of time is allocated to a process, often by heuristics
such as simply choosing the process that is active at that moment. Thus the reported
CPU time for a process may be no more than a good estimate, particularly if the
system is busy.
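One common mitigation, sketched below in Python, is to repeat a fast operation enough times that the total comfortably exceeds the clock's granularity, and to take the best of several trials so that other system activity inflates the figure as little as possible. The statement being timed is an arbitrary stand-in.

    import time
    import timeit

    number = 100_000
    stmt = "sorted(range(100))"   # arbitrary fast operation (assumed)

    # Time many repetitions with the CPU-time clock and divide, so that
    # the total comfortably exceeds the clock's granularity.
    per_call = timeit.timeit(stmt, timer=time.process_time,
                             number=number) / number

    # The best of several trials is least affected by system load.
    best = min(timeit.repeat(stmt, timer=time.process_time,
                             number=number, repeat=5)) / number
    print(f"mean per call: {per_call:.2e} s   best per call: {best:.2e} s")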
Memory and disk requirements. It is often possible to trade memory requirements against time, not only by choice of algorithm but also by changing the way disk is used and memory is accessed. You should take care to specify how your algorithms use memory.
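The Python sketch below illustrates one such trade-off with an assumed example of my own choosing, not one from this text: counting the set bits of 32-bit integers either with a precomputed 64 K lookup table (more memory, less time per query) or from scratch on every query, with tracemalloc reporting peak memory alongside elapsed time.

    import random
    import time
    import tracemalloc

    def with_table(values):
        # Memory-for-time: a 64 K lookup table of 16-bit popcounts,
        # built once and consulted twice per 32-bit query.
        table = [bin(i).count("1") for i in range(1 << 16)]
        return [table[v >> 16] + table[v & 0xFFFF] for v in values]

    def without_table(values):
        # Time-for-memory: recompute each popcount from scratch.
        return [bin(v).count("1") for v in values]

    values = [random.getrandbits(32) for _ in range(200_000)]
    for fn in (with_table, without_table):
        tracemalloc.start()
        start = time.perf_counter()
        fn(values)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        print(f"{fn.__name__:>13}: {elapsed:.3f} s, peak {peak / 1024:.0f} KiB")

Reporting both columns, rather than time alone, is what makes the trade-off visible to the reader.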
Disk and network traffic. Disk costs have two components, the time to fetch the first bit of requested data (seek and latency time) and the time required to transmit the requested data (at a transfer rate). Thus sequential accesses and random accesses have greatly different costs. For current hardware, in which there are several levels of caching and buffering between disk and user process, it may also be appropriate to consider repeat accesses, in which case there is some likelihood that the access cost will be low. The behaviour of network traffic is similar: the cost of transmitting the first byte is greater than the cost for subsequent bytes, for example.
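The difference can be observed directly. The Python sketch below, using an assumed scratch-file name and block size, writes a file and then reads the same volume of data once sequentially and once via random seeks. Note that operating-system caching may narrow the measured gap, since the freshly written file is likely to be in the page cache.

    import os
    import random
    import time

    PATH = "scratch.dat"          # assumed scratch-file name
    BLOCK, BLOCKS = 4096, 25_000  # 4 KB blocks, about 100 MB in total

    # Write the scratch file.
    with open(PATH, "wb") as f:
        for _ in range(BLOCKS):
            f.write(os.urandom(BLOCK))

    # One sequential pass over the whole file.
    start = time.perf_counter()
    with open(PATH, "rb") as f:
        while f.read(BLOCK):
            pass
    sequential = time.perf_counter() - start

    # The same volume of data fetched by random seeks.
    start = time.perf_counter()
    with open(PATH, "rb") as f:
        for _ in range(BLOCKS):
            f.seek(random.randrange(BLOCKS) * BLOCK)
            f.read(BLOCK)
    random_access = time.perf_counter() - start

    os.remove(PATH)
    print(f"sequential: {sequential:.2f} s   random: {random_access:.2f} s")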
Because of the sophistication of current disk drives and the complexity of their interaction with CPU and operating system, exact mathematical descriptions of algorithm behaviour are unattainable; broad approximations are often the only manageable way of describing disk performance.
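A typical broad approximation is a first-order cost model in which every random access pays a fixed seek-and-latency overhead before data flows at the transfer rate. The constants in the Python sketch below are assumed round figures for illustration, not measurements of any particular drive.

    # Assumed round figures for a rotating disk; not measurements.
    SEEK_AND_LATENCY = 0.010  # seconds of overhead per random access
    TRANSFER_RATE = 150e6     # bytes per second once data is flowing

    def disk_cost(accesses, bytes_per_access):
        # Each access pays the fixed overhead, then transfers its data.
        return accesses * (SEEK_AND_LATENCY
                           + bytes_per_access / TRANSFER_RATE)

    # 10,000 random 4 KB reads versus one sequential 40 MB read:
    print(f"random:     {disk_cost(10_000, 4096):6.1f} s")
    print(f"sequential: {disk_cost(1, 10_000 * 4096):6.1f} s")

Even this crude model captures the essential point: ten thousand random 4 KB reads cost on the order of a hundred seconds, while the same forty megabytes read sequentially costs a fraction of a second.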