a sample of documents, average may well be appropriate; but for evaluating typical
network delay for a round-trip of a packet, average may well be meaningless. First,
some delays are effectively infinite (the packet is lost). Second, the distribution of
such delays often consists of a large number of fast responses and a small number
of extremely slow responses; the average is therefore somewhat slower than the fast
times, but in a range where no values were observed at all. An analogy is averaging
the duration of a plane flight and of a car journey from Paris to Moscow, and stating
that this middle value is a typical travel time—although it would never be observed
in practice.
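The effect is easy to demonstrate. The following Python sketch uses invented round-trip delays with two clusters of values; the mean lands between the clusters, in a range where no value was observed, while the median reports a typical fast response.

```python
import statistics

# Synthetic round-trip delays in milliseconds: many fast responses and a
# few extremely slow ones. These values are invented for illustration.
delays = [10, 11, 10, 12, 11, 10, 11, 12, 10, 11, 500, 900]

mean = statistics.mean(delays)      # falls between the two clusters
median = statistics.median(delays)  # a typical fast response

print(f"mean   = {mean:.1f} ms")
print(f"median = {median:.1f} ms")
```

Here the mean (about 126 ms) corresponds to no observed behaviour at all, whereas the median (11 ms) describes a typical response.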
A consequence of this reasoning is that there are cases where the maximum or
the minimum may be the best value to report. For example, the time taken for a
distributed system to process a problem may vary a little depending on a range of
variables, all of which have the effect of interfering with the system. The minimum
time represents the most pure run, in which the system has had the least additional
work. Thus it may be appropriate to report the fastest time observed, while noting
the variance.
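As a sketch of this practice, Python's standard library can time repeated runs and report the minimum alongside the spread; the task being timed here is arbitrary, chosen only for illustration.

```python
import statistics
import timeit

# Time the same small task several times. The minimum approximates the
# "purest" run, with the least interference from other system activity;
# the spread indicates how variable the remaining runs were.
times = timeit.repeat(stmt="sorted(range(10_000))", repeat=5, number=100)

print(f"min    = {min(times):.4f} s")
print(f"stdev  = {statistics.stdev(times):.4f} s")
```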
Reporting of Variability
Averaging provides valuable insight into typical behaviour, but it is often also appro-
priate to report variability. It is particularly important for determining statistical
significance, as described later; but even a simple analysis of variability can deepen
analysis of results.
In a paper where the authors examined how efficiently a particular distributed
system (of 64 remote servers connected with a novel virtual topology) could process
relational operations such as joins and sorts, they followed an appropriate methodology in which the size of the tables being used as input was varied,² and reported on how the elapsed time varied with problem size.
However, they dutifully reported “surprising” results such as that the method was
faster for 4,000,000 records than for 3,000,000 records, and elaborately speculated
as to whether the topology somehow led to less contention for certain ordinal sizes of
problem. It turned out that they had reported averages over a few runs for each size—
and that in some cases the average included wild outliers, perhaps due to other traffic
on the network, where one run had been 10 or even 100 times slower than the others.
Their incomplete reporting meant that they had significantly misinterpreted their own results.
² The pattern of size variation was not well chosen, though. As is common practice, they increased
the size of the tables linearly, in this case from 1,000,000 records to 10,000,000 records, in increments
of 1,000,000. However, they used this result to make claims about scaling—although only one
(decimal) order of magnitude was present. The result would have been more impressive if they had
increased the size geometrically, from say 10,000 records to 100,000,000 records, by a factor of 10 at each step. A logarithmic graph of size versus time would have clearly demonstrated
a trend.
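The footnote's two sampling schemes might be generated as follows; the endpoints and factor are taken from the text above, and this is a sketch rather than the authors' code.

```python
# Linear sizes: ten points spanning only one decimal order of magnitude.
linear = list(range(1_000_000, 10_000_001, 1_000_000))

# Geometric sizes: multiply by 10 at each step, spanning four orders
# of magnitude with only five measurement points.
geometric = []
size = 10_000
while size <= 100_000_000:
    geometric.append(size)
    size *= 10

print(linear)     # 1,000,000; 2,000,000; ...; 10,000,000
print(geometric)  # 10,000; 100,000; ...; 100,000,000
```

Plotted with both axes on a logarithmic scale, the geometric sizes would make any polynomial scaling trend appear as a straight line.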