Statistical Tools
Statistical tools that have wide application in computer science research include
correlation, regression, and hypothesis testing. Measures of correlation are used to
determine whether two variables depend on each other. Regression is used to identify
the relationship between two variables. These can be used, for example, to determine
whether input size affects speed or whether light intensity affects object recognition.
Given the variability inherent in the experimental output, how do we know that the
results we observe are due to some real effect, and not just to chance? Understanding
this core question is fundamental to understanding not only which statistical tests
to use, but also how to design experiments, and what conclusions can be drawn
from them.
The principal concepts of statistical inference can be seen through a simple example (which I present here in some detail, because these concepts are often misunderstood). Consider the experiment of trying to determine whether a coin is biased; that is, whether the coin has a probability of coming up heads that is other than 50 %. Suppose the coin is flipped 12 times, and heads is observed 9 times. Taken naïvely, the results of our experiment might suggest that the coin is biased: three-quarters of the
flips have turned up heads. But even if the coin is unbiased, on any given sequence
of flips the proportion of heads may diverge from 50 %; any sequence of coin flips
is possible.
The question we have to ask instead is, if a coin is unbiased, how likely are we to
observe 9 heads or more from 12 flips? If this likelihood is sufficiently small, then we
can with confidence—though not with certainty—conclude that the coin is biased.
There are 2¹² = 4096 distinct sequences of coin tosses. The number that have 12 heads is 1; that have 11 heads is 12; that have 10 heads is 12 × 11/2 = 66; and that have 9 heads is (12 × 11 × 10)/(3 × 2) = 220. So there is a total of 220 + 66 + 12 + 1 = 299 ways of getting at least 9 heads. If the coin is unbiased, then any given sequence of flips, such as hhththhtthth, is as likely as any other sequence, even tttttttttttt. Therefore the probability of flipping 9 or more heads with 12 flips of an unbiased coin is 299/4096 = 7.3 %. A common experimental protocol is to set a threshold of 5 % probability or less before we are confident in a conclusion;³ the probability here is slightly too high to confidently reject the possibility that the coin is unbiased towards heads.
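The counting argument above is easy to check directly. A minimal sketch using Python's math.comb (the function name prob_at_least is my own):

```python
from math import comb

def prob_at_least(heads: int, flips: int) -> float:
    """One-tailed probability of observing `heads` or more heads
    in `flips` tosses of a fair coin."""
    # Count the sequences with at least `heads` heads ...
    favourable = sum(comb(flips, k) for k in range(heads, flips + 1))
    # ... out of 2**flips equally likely sequences.
    return favourable / 2 ** flips

# 220 + 66 + 12 + 1 = 299 favourable sequences out of 4096
p = prob_at_least(9, 12)   # 299/4096, about 0.073
```

Since p is above the 5 % threshold, the test does not reject the null hypothesis of an unbiased coin.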
This example illustrates most of the important concepts behind statistical hypothesis testing. The supposition that "the result was by chance" is represented by our
null hypothesis—that the coin is truly unbiased. The result we are testing is stated
as the alternative hypothesis—that the coin has a positive bias. We then calculate
the likelihood of the observed or a more extreme result, of 9 or more heads, on the
assumption that the null hypothesis is true. This is known as a one-tailed test. For
/
4096
=
7
.
³ The question of whether and when this protocol is correct or appropriate is beyond the scope of this topic. The use of thresholds and particular statistical tests is a continuing topic of scientific debate, and methodologies continue to develop. What is clear is that some use of hypothesis testing is preferable to simple reporting of averages and claimed "improvements".