significant way. One way was to create a “decoy” set of exon scrambling
events: an imaginary set of scrambling events of the same size as the
actual set, but distributed randomly across exons for the given genes.
The idea was that if a biological effect showed up in the true set of
scrambling events and not in the decoy set, then I was observing a real
effect; if it showed up in both, something else, probably nonbiological,
was going on.
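The decoy-set comparison described above amounts to a simple permutation test. The sketch below illustrates the idea with invented data: the gene names, exon counts, and scrambling events are hypothetical, and the test statistic (the average distance between scrambled exons) is just one plausible choice, not the one used in the lab.

```python
import random

random.seed(0)

# Hypothetical data: each gene's exon count, plus a set of observed
# "scrambling" events, written as (gene, donor_exon, acceptor_exon)
# with donor > acceptor (a downstream exon spliced before an upstream one).
gene_exon_counts = {"geneA": 12, "geneB": 8, "geneC": 20}
observed_events = [
    ("geneA", 5, 2), ("geneA", 7, 3), ("geneB", 6, 1),
    ("geneC", 10, 4), ("geneC", 15, 9),
]

def make_decoy(events, exon_counts, rng=random):
    """Build a decoy set: same number of events, on the same genes,
    but with the exon pairs drawn uniformly at random."""
    decoy = []
    for gene, _, _ in events:
        n = exon_counts[gene]
        donor, acceptor = rng.sample(range(1, n + 1), 2)
        if donor < acceptor:          # keep the scrambled orientation
            donor, acceptor = acceptor, donor
        decoy.append((gene, donor, acceptor))
    return decoy

def mean_span(events):
    """Test statistic: average distance between the scrambled exons."""
    return sum(d - a for _, d, a in events) / len(events)

# Empirical p-value: how often does a random decoy set look at least
# as extreme (here: as tightly spaced) as the real events?
obs = mean_span(observed_events)
trials = 1000
hits = sum(mean_span(make_decoy(observed_events, gene_exon_counts)) <= obs
           for _ in range(trials))
print(f"observed mean span = {obs:.2f}, empirical p = {hits / trials:.3f}")
```

If the observed statistic shows up in the decoy sets about as often as in the real data, the effect is likely an artifact of how exons are distributed, not biology.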
At the end of my fieldwork, when I presented my findings at the
weekly lab meeting, most of the criticisms I received related to statis-
tical shortcomings. Indeed, the suggestions for improving my analysis
centered on ways to make better statistical models of scrambled exons
in order to differentiate signal from noise. In other words, it was not
so much that my work had left room for other plausible biological ex-
planations as that it had not ruled out other possible statistical expla-
nations. This example suggests that bioinformatics entails new criteria
for evaluating knowledge claims, based on statistical, rather than direct
experimental, evidence. But it also shows the importance of the com-
puter as a tool for rearranging objects, rapidly bringing them into new
orders and relationships: mixing up, sorting, and comparing chunks of
sequence.
This project, as well as other work I observed in the lab, can also be
considered a form of simulation: scrambled exons could be considered
“real” events only when a model (the decoy set, for instance) failed
to produce the observed number of scrambling events. This process is
remarkably similar to the accounts of particle physics experiments re-
ferred to in chapter 1—there too, a simulation of the background is
generated by the computer and the particle is said to be really present
if the observed number of events is significantly above that predicted
by the simulation.28 Early in my fieldwork, I jotted down the following
observation:
The work in the lab seems to me to have a lot of the messi-
ness of other lab work. Just as you can't see what proteins or
molecules are doing when you are working with them on a lab
bench, in this work one cannot see all your data at once, or re-
ally fully understand what the computer is doing with it. There
are analogous problems of scale—one realm is too small, the
other too informatically vast for you to try to manipulate indi-
vidual pieces. This means that you have to try to engineer ways
to get what you want out of the data, to make it present itself
in the interesting ways, to find the hidden pattern, just as you