followed by execution for a month is a lot less efficient than coding for a day
followed by execution for 20 min, especially if you have to run the experiment
again.
• Use the right tool, not the most convenient tool. For example, if you plan to measure
algorithmic efficiency, implement in a suitable language.
• Don't re-code unnecessarily. Use libraries; is it really necessary to do a fresh
implementation of a B-tree?
• Find an independent way of verifying that the output is correct. If you think you've
successfully compressed some data, prove it: write a working decompressor.
• For long-running processes, consider developing mechanisms for periodically
saving state, so that weeks (or more) of work isn't lost.
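One way to save state periodically is sketched below. This is only an illustration under stated assumptions: the names `save_checkpoint`, `load_checkpoint`, and `run` are hypothetical, the state is assumed to fit in a small JSON-serializable dictionary, and the loop body is a stand-in for real work. The checkpoint is written to a temporary file and then renamed, so a crash during the write should not destroy the previous checkpoint.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temporary file, then rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # replaces the old file in a single step


def load_checkpoint(path, default):
    """Return the saved state, or the default if no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path) as f:
        return json.load(f)


def run(path, total_steps, checkpoint_every=1000):
    """Resume from the last checkpoint (if any) and continue to total_steps."""
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < total_steps:
        state["total"] += state["step"]  # hypothetical stand-in for real work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

If the process is killed and `run` is called again with the same path, it continues from the last saved step rather than from the beginning. How often to checkpoint is a trade-off between the cost of writing state and the amount of work that can be lost.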
In environments such as the Unix family of operating systems, a program is often
tested by being run from the command line, with output directed to the screen.
Parameters may be passed in as arguments, but to simplify coding they may be
defined as constants within programs. All too often, though, a researcher discovers
that an experiment run in this way cannot be repeated a day or two later.
A more reliable, repeatable approach is to run all experiments from scripts.
Parameter settings are captured within the script; the settings used last time can be
commented out. Output from the script can be directed to a log file and kept indefinitely.
If the output is well designed, it should include information such as input file names,
code versions, parameter values, and date and time.
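A driver script along these lines might look like the following sketch. Everything named here is an assumption for illustration: the parameters `buffer_kb` and `trials`, the functions `file_digest`, `header_lines`, and `run_experiment`, and the log layout are all hypothetical, and the "measurement" is a placeholder. The code version is passed in as a string (it could, for instance, be a version-control revision identifier), since how it is obtained depends on the project.

```python
import datetime
import hashlib
import os

PARAMS = {"buffer_kb": 64, "trials": 3}
# PARAMS = {"buffer_kb": 32, "trials": 3}  # settings used last time, kept for reference


def file_digest(path):
    """Short SHA-256 digest, so the log identifies the exact input contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()[:12]


def header_lines(input_paths, params, code_version, now=None):
    """Build the log header: date and time, code version, inputs, parameters."""
    now = now or datetime.datetime.now()
    lines = [
        f"# date {now.isoformat(timespec='seconds')}",
        f"# code_version {code_version}",
    ]
    lines += [f"# input {p} sha256={file_digest(p)}" for p in input_paths]
    lines += [f"# param {k}={params[k]}" for k in sorted(params)]
    return lines


def run_experiment(input_paths, params, code_version, log_path, now=None):
    """Append a header and one result line per trial to the log file."""
    with open(log_path, "a") as log:
        for line in header_lines(input_paths, params, code_version, now):
            log.write(line + "\n")
        for trial in range(params["trials"]):
            # Placeholder measurement: total input size in bytes.
            size = sum(os.path.getsize(p) for p in input_paths)
            log.write(
                f"result trial={trial} buffer_kb={params['buffer_kb']} bytes={size}\n"
            )
```

Opening the log in append mode means earlier runs are kept rather than overwritten, and the header written before each run records what produced the results that follow it.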
Using simple Unix tools, it is straightforward to take data directly from a log file
and produce a graph or other summary of the results. These steps too can be encoded
in a script; the process for completing any stages undertaken by hand may well be
forgotten if the work is rested for a few months, such as while a paper is under review.
A corollary is that the output of your code should be amenable to scripting, with, for
example, consideration given to consistent use of fields in each line of output from
experiments.
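To illustrate why consistent fields matter, the sketch below summarizes a log in which every result line is assumed to have the form `result key=value key=value ...`. That format, the field names `buffer_kb` and `seconds`, and the functions `parse_line` and `summarize` are hypothetical choices made for this example, not a prescribed layout. With fields this regular, a few lines of code are enough to group the results and compute a mean for each group.

```python
import statistics
from collections import defaultdict


def parse_line(line):
    """Turn 'result key=value ...' into a dict; return None for any other line."""
    parts = line.split()
    if not parts or parts[0] != "result":
        return None
    return dict(field.split("=", 1) for field in parts[1:])


def summarize(lines, group_key, value_key):
    """Mean of value_key for each distinct value of group_key."""
    groups = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None:
            groups[record[group_key]].append(float(record[value_key]))
    return {key: statistics.mean(values) for key, values in groups.items()}
```

For example, `summarize(open("run.log"), "buffer_kb", "seconds")` would give the mean time for each buffer size, assuming a log file of that name with those fields. Comment and header lines are skipped because they do not begin with `result`.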
A practical consideration is whether the experiments are feasible at all. Experiments
can require storage of large volumes of data; implementation of production-quality
code; execution over months, with repetitions after failure; access to particular
machines or configurations; use of humans for evaluation of results; access to
restricted data sets; use of particular pieces of software; or most of these things at
the same time. Before proceeding too far with a research question, you need to be
confident that you will have the resources required to undertake the experiments that
are needed for a persuasive outcome.
Describing Experiments
Your interpretation and understanding of the results can be as important as the results
themselves. When describing the outcomes of an experiment, don't just compile
dry lists of figures or a sequence of graphs. Analyze the results and explain their
 