use of standardized resources means that there is direct control of the principal variables, and experiments are comparable between research groups; existing published results provide a baseline against which new results can be directly compared.
By the standards of computer science, the TREC experiment is expensive, with, for example, some months of assessor time required every year. However, TREC illustrates that robust experiments can have high impact. When TREC began (in 1992, a year or two before the Web began to be significant), there was a large range of competing theories about the best way to match documents to queries. Weak methods were rapidly culled by TREC, and a great many dramatic improvements in information retrieval were spurred by the opportunity that TREC created. The Web search engines drew substantial inspiration from the TREC work and, in contrast to some other areas of computer science, the links between academia and industry remain strong. This impact could not have been achieved without the large-scale involvement of human assessors, or without the commitment to robust experimentation.
Coding for Experimentation
In computer science research, in principle at least, the sole reason for coding is to
build tools and probes for generating, observing, or measuring phenomena. Thus the
choice of what to measure guides the process of coding and implementation—or,
perhaps, indicates what does not have to be coded.
The basic rule is to keep things simple. If efficiency is not being measured, for example, don't waste time squeezing cycles from code. If a database join algorithm is being measured, it may not be necessary to implement indexes, and it is almost certainly unnecessary to write an SQL interpreter. All too often, computer scientists get distracted from the main task of producing research tools and instead, for example, develop complete systems.
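For instance, the join algorithm itself can be timed directly over small synthetic in-memory relations, with no SQL interpreter or index structures around it. The following Python sketch is only an illustration of this kind of minimal harness; the data, sizes, and names are invented for the example.

    import random
    import time

    def hash_join(left, right, key):
        # Build a hash table on the left relation, then probe it with the right.
        table = {}
        for row in left:
            table.setdefault(row[key], []).append(row)
        return [(l, r) for r in right for l in table.get(r[key], [])]

    # Synthetic relations: no SQL layer, no indexes, just the operation under study.
    left = [{"id": i, "val": "x"} for i in range(100_000)]
    right = [{"id": random.randrange(100_000), "val": "y"} for _ in range(100_000)]

    start = time.perf_counter()
    pairs = hash_join(left, right, "id")
    print(f"{len(pairs)} pairs joined in {time.perf_counter() - start:.3f} s")

A few dozen lines of this kind are enough to produce the measurement of interest; everything else in a full database system would be effort spent on code that is never measured.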
In coding for an experiment, there are several other such rules or guidelines that
might seem obvious, but which are often not followed. Examples include:
• One task, one tool: decompose the problem into separate pieces of code. In most cases, trying to create a single piece of code that does everything is just not productive. Do you need to integrate the data classifier into the network generator, and the network generator into the visualizer? Wouldn't it have been easier to develop them independently and combine them with a script? (A minimal sketch of such a script follows this list.)
• Be aware that you may need to trade ease of implementation against realism of the result. Can load balancing across distributed machines on a network be examined without development of significant software infrastructure? Can an algorithm be assessed if all data is held in memory, or is it necessary, for realism, to manage data on disk, perhaps in a custom-built file system? Hard-coding of data structures, input formats, and so on, may allow for rapid implementation, but does it lead to unrealistic behaviour or simplifications?
• Cut the right corners. Coding for a day to save an hour's manual work is a waste of time, even if coding is the more principled approach. But coding for an hour to save a day of manual work is clearly worthwhile.
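As a sketch of the "one task, one tool" guideline, the glue between separately developed programs can be a few lines of Python. The tool names below (classify.py, build_network.py, visualize.py) and their file-based interfaces are hypothetical, invented only to show the shape of such a driver script.

    import subprocess

    # Each hypothetical tool does one job and communicates through files;
    # a short driver replaces a monolithic classifier-generator-visualizer.
    subprocess.run(["python", "classify.py", "raw_data.csv", "labels.csv"], check=True)
    subprocess.run(["python", "build_network.py", "labels.csv", "network.json"], check=True)
    subprocess.run(["python", "visualize.py", "network.json", "network.pdf"], check=True)

Keeping the tools separate also means that each stage can be rerun, replaced, or checked in isolation when an experiment needs to be varied or repeated.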