not simply a question of “opening up a black box” or doing more work
to fill in the blanks: starting to explore the data by hand or beginning
to analyze a piece of software line by line is usually a practically impos-
sible task. 30 This ignorance has many sources—it may come from the
proprietary nature of software or lab instruments, from the sheer size of
data sets, or from inherent uncertainties in an experimental technique. 31
In bioinformatics, though, this ignorance is controlled through a specific
process of computational modeling: sanity checks or simulations fill in
the gaps. It is through these specific computational techniques that ignorance
is controlled and trust in computational results is produced.
This kind of knowledge production marks a significant break with
pre-informatic biology. The general questions that bioinformatics poses
can be addressed by using computers as statistical and simulation tools;
the management of large amounts of data enables bioinformaticians
not only to ask big questions, but also to use the computer as a specific
type of data reduction instrument. Bioinformatic biology is not distinct
simply because more data are used. Rather, the amounts of data used re-
quire distinct techniques for manipulating, analyzing, and making sense
of them. Without these computerized statistical techniques, there would
be no data—they would simply not be collected because they would be
worthless. These techniques make it possible not only to aggregate and
compare data, but to parse, rearrange, and manipulate them in a variety
of complex ways that reveal hidden and surprising patterns. Computers
are required not only to manage the order of the data, but also to man-
age their disorder and randomness.
Hypothesis and Discovery
In the last decade, several terms have arisen to label the kind of work
I am describing: “hypothesis-free,” “data-driven,” “discovery,” or
“exploratory” science. A traditional biologist's account of knowledge
making might come close to Karl Popper's notion of falsification: 32 theory
firmly in mind, the biologist designs an experiment to test (or falsify)
that theory by showing its predictions to be false. However, this hy-
pothetico-deductive method can be contrasted with an older notion of
inductive science as described by Francis Bacon. According to Bacon,
scientific inquiry should proceed by collecting as many data as possible
prior to forming any theories about them. This collection should be
performed “without premature reflection or any great subtlety” so as not to
prejudice the kinds of facts that might be collected. 33
Bioinformatics can be understood in these terms as a kind of neo-