Information Technology Reference
In-Depth Information
outcomes. And if increasing a value has an effect, what happens with a further
increase? That is, you should explore factors broadly enough to make trends and
patterns clear.
Consider the difficulties in measuring how the performance of a distributed algo-
rithm changes as the number of servers is increased. Suppose the volume of data is
fixed. As the number of servers is increased, it may be that at some point the data on
each machine can be held wholly in memory (because the per-machine volume of
data is falling), so that disk is not required. The experiment might then show a dra-
matic increase in performance; but this would in fact be due to the reduced problem
size, not to the power of distribution. On the other hand, if the data volume is scaled
with number of machines, then the characteristics of the data set as a whole may be
altered. In some cases there is no straightforward resolution to such issues.
At the start of Chap. 4 , I introduced an example of a research program that was
designed to investigate the impact of cache on the relative performance of two data
structures. This example illustrates some of the difficulties in producing a robust,
persuasive experimental design.
First, my proposal was that performance be assessed by counting cache misses
(an operation that may be supported in hardware, or may involve use of a simulator).
This approach produces evidence that is clearly related to the hypothesis, which
concerns cache behaviour, but does not directly demonstrate an improvement in
computational efficiency. 3 Second, the relative behaviour will depend on the data
volumes. For example, it may be that, as the amount of data is increased, the costs
for the tree-based algorithm grow slowly until some thresholds are exceeded, then
grow dramatically as just a little more data is introduced, and then grow slowly
again. Third, behaviour with growth might be obvious for some data sets, and less
conspicious for others, depending on the pattern of accesses.
To produce strong experimental designs for this problem, researchers needs to
use their understanding (that is, models, formal or otherwise) of the algorithms to
anticipate whether issues of robustness will arise, and then to choose measures and
data that have the potential to expose inconsistencies between the models and the
behaviour is practice.
Performance of Algorithms
As an example of experimental design, consider potential approaches to assessment
of the performance of algorithms. The tools, as discussed in Chap. 4 , are formal proof,
mathematical modelling, simulation, and experimentation. How these are used flows
3 Is computational efficiency even well defined? Is it the number of instructions used, say, or the
number of seconds each CPU core spends on the process? These are no longer equivalent. If the
core is otherwise idle, due to memory access delays, is processing time relevant at all? The answers
to these issues will depend on the research question.
 
Search WWH ::




Custom Search