tations of results, minutes of decisions and agreed actions, and so on. They allow
easy reconstruction of old research, and simplify the process of write-up. They are
particularly helpful if a paper is accepted after a long reviewing process and experi-
ments have to be freshly run; all too often the code no longer produces anything like
the original results, because too many details of the experiments have been lost and
the code has been modified. Most of all, notebooks keep researchers honest.
While there are obvious reasons to consider maintaining notebooks online, in my
experience written notebooks continue to be as effective, and they help provide a
physical sense of progress and achievement that is somehow lacking in an online
equivalent. In either form, it is good discipline to include dates, never change an
entry, and use the notebook as often as possible.
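The append-only, dated discipline described above can also be enforced mechanically for an online notebook. A minimal sketch (the file name and helper are my own, not from the text): entries are always dated, and the file is only ever opened for appending, so earlier entries are never rewritten.

```python
from datetime import date

NOTEBOOK = "notebook.txt"  # hypothetical file name for the online notebook

def record(entry: str, notebook: str = NOTEBOOK) -> None:
    """Append a dated entry; existing entries are never modified."""
    # Opening in append mode ("a") means the helper cannot alter history,
    # mirroring the rule of never changing a written notebook entry.
    with open(notebook, "a", encoding="utf-8") as f:
        f.write(f"{date.today().isoformat()}  {entry}\n")

record("Re-ran experiment 3 on the cleaned data; results match the draft.")
```

The design choice is the point: by construction there is no operation for editing or deleting, only for adding, which is what keeps the record honest.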
Another strategy that keeps researchers honest, and helps to describe and publicize
their work, is to make code and data available online. Doing so shows that you have
high confidence in the correctness of your claims. In an informal survey I undertook
in the 1990s, several computer scientists commented to me that they would not have
made some of the claims in their papers if they had had to publish their code or to run
their experiments under external scrutiny. More positively, publishing code reduces
the barrier to entry for other researchers, and helps to establish baselines against
which new work should be measured.
Part of reporting experiments is describing the data that was used. Typically,
readers need to know how the data was gathered or created; how your version of the
data might be obtained or recreated; what the shortcomings of the data are, that is,
in what ways it might be uncertain, incomplete, or unreliable; and what aspects of
the research question are not tested by the data. Observe that raw data and massive
listings of intermediate outcomes are not in this list!
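The points above can be collected into a structured record that travels with a dataset. A sketch, assuming nothing beyond the text: the field names and example values are invented for illustration, not part of any standard.

```python
from dataclasses import dataclass, field

@dataclass
class DataDescription:
    """Sketch of the data documentation the text asks for."""
    name: str
    how_gathered: str                 # how the data was gathered or created
    how_to_obtain: str                # how readers can obtain or recreate it
    shortcomings: list[str] = field(default_factory=list)      # known gaps, noise, bias
    untested_aspects: list[str] = field(default_factory=list)  # parts of the question the data cannot test

# Hypothetical example entry.
desc = DataDescription(
    name="query-log sample",
    how_gathered="one week of anonymised search logs",
    how_to_obtain="regenerate by sampling the full log with the released script",
)
desc.shortcomings.append("bot traffic only partially filtered")
```

Note that the record holds provenance and limitations, not the raw data itself, matching the text's warning against massive listings.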
An “Experimentation” Checklist
Regarding the design of the experiments,

• Have appropriate baselines been identified? What makes them appropriate? Are
they state-of-the-art?
• What data has to be gathered, and where from?
• How will readers gather comparable data for themselves?
• Is the data real? Is it sufficient in volume? What validation is required for artificial
data?
• Should the data be seeded with examples to test the validity of the outcomes?
• Is there reference data for the problem, and what are its limitations?
• Will a domain expert be needed to interpret the results?
• What are the likely limitations on the results?
• Should the experimental results correspond to predictions made by a model?
• Will the reported results be comprehensive or a selection? Will the selection be
representative?