run on different sets of data; or whether the aim is to evaluate the behaviour of one
system, or to compare two.
Some researchers are deterred from considering statistics because of the math-
ematical analysis that may be involved. However, first, there are packages that do
much of the hard work. Second, many statistical problems can be couched in terms
of elementary probability and then resolved computationally. For example, consider
the problem of identifying the likelihood that a particular tennis player will win a
match, given that this player has an independent probability of 60 % of winning each
point. The rules are: a game is won when either player has at least four points and
a two-point advantage; a set is won when either player has at least six games and a
two-game advantage; a match is won when either player has a two-set advantage, or
by the winner of the fifth set. Determining the probability of a win is non-trivial for
the statistically innocent.
However, it is easy to write a program that runs a series of matches with a simple
random-number generator for determining who wins each point; over a few thousand
matches the average converges on a reliable value. A more computationally sensible
option is to run a large series of trials to estimate the likelihood of winning a game,
then use this value as input when determining the likelihood of winning a set, and so
on. It is not difficult to check that such code provides reasonable answers and that
the values do indeed converge.6 Many statistics can be estimated in this way; it was
used, for example, to confirm the outcomes of the string-hashing experiments and to
confirm the theoretical predictions for an ideal hash function, where random numbers
and 32-bit pieces of pi were used in place of hash values.
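As an illustration, the following is a minimal sketch of such a simulation in Python. It uses only the simplified scoring rules stated above (in particular, no tie-breaks), the 60 % point-win probability, and a trial count chosen for convenience; the function and variable names are illustrative rather than taken from any published code.

import random

def play_unit(p_win, target, margin):
    """Play one game or set: the first side to reach `target` sub-units
    with a lead of at least `margin` wins. `p_win` is the probability
    of winning each sub-unit, independently."""
    ours, theirs = 0, 0
    while True:
        if random.random() < p_win:
            ours += 1
        else:
            theirs += 1
        if ours >= target and ours - theirs >= margin:
            return True
        if theirs >= target and theirs - ours >= margin:
            return False

def play_match(p_set):
    """Match, per the simplified rules stated above: a two-set advantage
    wins, otherwise the winner of the fifth set wins."""
    ours, theirs = 0, 0
    while ours + theirs < 5:
        if random.random() < p_set:
            ours += 1
        else:
            theirs += 1
        if abs(ours - theirs) >= 2:
            return ours > theirs
    return ours > theirs

def estimate(trial, trials=100_000):
    """Monte Carlo estimate: the fraction of trials that are wins."""
    return sum(trial() for _ in range(trials)) / trials

p_point = 0.60
p_game = estimate(lambda: play_unit(p_point, 4, 2))   # 4 points, 2-point lead
p_set = estimate(lambda: play_unit(p_game, 6, 2))     # 6 games, 2-game lead
p_match = estimate(lambda: play_match(p_set))
print(f"game {p_game:.3f}  set {p_set:.3f}  match {p_match:.3f}")

Staging the estimates in this way, from points to games to sets to the match, is the "computationally sensible" option described above: each level needs only a large number of cheap trials, rather than a full match simulation per sample.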
Randomness and Error
In some circumstances it is possible for an experiment to succeed, or at least appear
to succeed, by luck; there might be an atypical pattern to the data, or variations
in system response might favour one run over another. Where such variations are
possible, many runs should be made, to reduce the probability of accidental success
and (in the statistical sense) to give confidence in the results. This is particularly
true for timings, which can be affected by other users, by system overheads, by the
inability of most operating systems to allocate clock cycles accurately to processes,
and by caching effects.
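A minimal sketch of this kind of repetition is shown below; it times a placeholder operation many times with the Python standard library and reports the spread rather than a single figure. The operation being measured and the run count are illustrative assumptions only.

import statistics
import time

def measure(operation, runs=30):
    """Time `operation` repeatedly and summarise the spread.

    A single timing is unreliable: other processes, system overheads,
    and caching all perturb individual runs, so report the distribution
    rather than one number."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return min(samples), statistics.mean(samples), statistics.stdev(samples)

def operation_under_test():
    # Placeholder for whatever is actually being measured.
    sum(i * i for i in range(100_000))

best, mean, sd = measure(operation_under_test)
print(f"min {best:.6f}s  mean {mean:.6f}s  sd {sd:.6f}s")

The minimum is often taken as the most stable summary for CPU-bound work, while the spread gives some indication of how much interference the individual runs suffered.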
For example, consider the apparently simple experiment of measuring how fast a
block can be accessed from a file stored on disk. Under a typical operating system, the
6 A typical guess of the likelihood of the better player winning the match is 90 or 95 %; in fact, the
likelihood is close to certainty.
When I once suggested to students that they test the code by running it with a probability of
50 % of winning each point, several argued strongly that the program wouldn't terminate—which
is more or less equivalent to arguing that, when tossing coins, you can't get some given number of
heads in a row. They had confused the short-term variability (any number of consecutive throws of
a head will come up eventually) with long-term averages. Such are the pitfalls of intuition.