able to justify such a result, but no experiment will be so far-reaching. In any case, it
is rare for a data structure to be completely superseded—consider the durability of
arrays and linked lists—so in all probability this hypothesis is incorrect. A testable
hypothesis might be
As an in-memory search structure for large data sets, Q-lists are faster and
more compact than P-lists.
Further qualification may well be necessary.
We assume there is a skew access pattern, that is, that the majority of accesses
will be to a small proportion of the data.
The qualifying statement imposes a scope on the claims made on behalf of Q-lists.
A reader of the hypothesis has enough information to reasonably conclude that
Q-lists do not suit a certain application; this limitation does not invalidate the result,
but instead strengthens it, by making it more precise. Another scientist would be free
to explore the behaviour of Q-lists under another set of conditions, in which they
might be inferior to P-lists, but again the original hypothesis remains valid.
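The skewed-access assumption in the qualifying statement can be made concrete when designing an experiment. The sketch below is illustrative only (Q-lists and P-lists are hypothetical structures, so none is implemented here): it generates a Zipf-like workload in Python and checks that most accesses do indeed fall on a small proportion of the keys, which is the property an experimenter would want the test data to satisfy.

```python
import random
from collections import Counter

def zipf_workload(n_keys, n_accesses, s=1.2, seed=0):
    """Generate a skewed (Zipf-like) access pattern: key i is drawn
    with probability proportional to 1 / (i + 1) ** s."""
    rng = random.Random(seed)
    weights = [1.0 / (i + 1) ** s for i in range(n_keys)]
    return rng.choices(range(n_keys), weights=weights, k=n_accesses)

accesses = zipf_workload(10_000, 100_000)
counts = Counter(accesses)

# Fraction of accesses that hit the most popular 10% of keys --
# under the stated assumption, this should be a clear majority.
top_tenth = sum(c for _, c in counts.most_common(len(counts) // 10))
print(top_tenth / len(accesses))
```

A workload generated this way lets the same skew assumption be stated, varied, and reproduced by other experimenters, which is exactly the precision the qualified hypothesis aims for.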
As the example illustrates, a hypothesis must be testable. One aspect of testability
is that the scope be limited to a domain that can feasibly be explored. Another,
crucial aspect is that the hypothesis should be capable of falsification. Vague claims
are unlikely to meet this criterion.
Q-list performance is comparable to P-list performance.
Our proposed query language is relatively easy to learn.
The exercise of refining and clarifying a hypothesis may expose that it is not
worth pursuing. For example, if complex restrictions must be imposed to make the
hypothesis work, or if it is necessary to assume that problems that are currently
insoluble must be addressed before the work can be used, how interesting is the
research?
A form of research where poor hypotheses seem particularly common is “black
box” work, where the black box is an algorithm whose properties are poorly under-
stood. For example, some research consists of applying a black-box learning algo-
rithm to new data, with the outcome that the results are an improvement on a baseline
method. (Often, the claim is to the effect that “our black box is significantly better
than random”.) The apparent ability of these black boxes to solve problems without
creative input from a scientist attracts research of low value. A weakness of such
research is that it provides no insights into the data or the black box, and has no
implications for other investigations. In particular, such results rarely tell us whether
the same behaviour would occur if the same approach were applied to a different
situation, or even to a new but similar data set.
That is, the results are not predictive. There may be cases in which it is interesting
to observe the behaviour of an algorithm on some data, but in general the point of
experimentation is to confirm models or theories, which can then be used to predict
future behaviour. That is, we use experiments to learn about more general properties,
a characteristic that is missing from black-box research.