significant way. One way was to create a “decoy” set of exon scrambling
events: an imaginary set of scrambling events of the same size as the
actual set, but distributed randomly across exons for the given genes.
The idea was that if a biological effect showed up in the true set of
scrambling events and not in the decoy set, then I was observing a real
effect; if it showed up in both, something else, probably nonbiological,
was going on.
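The decoy-set comparison described above amounts to a simple permutation test. The sketch below illustrates the idea with invented data: the gene names, exon counts, and scrambling events are hypothetical, and the test statistic (the average distance between scrambled exons) is just one plausible choice, not the one used in the lab.

```python
import random

random.seed(0)

# Hypothetical data: each gene's exon count, plus a set of observed
# "scrambling" events, written as (gene, donor_exon, acceptor_exon)
# with donor > acceptor (a downstream exon spliced before an upstream one).
gene_exon_counts = {"geneA": 12, "geneB": 8, "geneC": 20}
observed_events = [
    ("geneA", 5, 2), ("geneA", 7, 3), ("geneB", 6, 1),
    ("geneC", 10, 4), ("geneC", 15, 9),
]

def make_decoy(events, exon_counts, rng=random):
    """Build a decoy set: same number of events, on the same genes,
    but with the exon pairs drawn uniformly at random."""
    decoy = []
    for gene, _, _ in events:
        n = exon_counts[gene]
        donor, acceptor = rng.sample(range(1, n + 1), 2)
        if donor < acceptor:          # keep the scrambled orientation
            donor, acceptor = acceptor, donor
        decoy.append((gene, donor, acceptor))
    return decoy

def mean_span(events):
    """Test statistic: average distance between the scrambled exons."""
    return sum(d - a for _, d, a in events) / len(events)

# Empirical p-value: how often does a random decoy set look at least
# as extreme (here: as tightly spaced) as the real events?
obs = mean_span(observed_events)
trials = 1000
hits = sum(mean_span(make_decoy(observed_events, gene_exon_counts)) <= obs
           for _ in range(trials))
print(f"observed mean span = {obs:.2f}, empirical p = {hits / trials:.3f}")
```

If the observed statistic shows up in the decoy sets about as often as in the real data, the effect is likely an artifact of how exons are distributed, not biology.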
At the end of my fieldwork, when I presented my findings at the
weekly lab meeting, most of the criticisms I received related to statis-
tical shortcomings. Indeed, the suggestions for improving my analysis
centered on ways to make better statistical models of scrambled exons
in order to differentiate signal from noise. In other words, it was not
so much that my work had left room for other plausible biological ex-
planations as that it had not ruled out other possible statistical expla-
nations. This example suggests that bioinformatics entails new criteria
for evaluating knowledge claims, based on statistical, rather than direct
experimental, evidence. But it also shows the importance of the com-
puter as a tool for rearranging objects, rapidly bringing them into new
orders and relationships: mixing up, sorting, and comparing chunks of
sequence.
This project, as well as other work I observed in the lab, can also be
considered a form of simulation: scrambled exons could be considered
“real” events only when a model (the decoy set, for instance) failed
to produce the observed number of scrambling events. This process is
remarkably similar to the accounts of particle physics experiments re-
ferred to in chapter 1—there too, a simulation of the background is
generated by the computer and the particle is said to be really present
if the observed number of events is significantly above that predicted
by the simulation.28 Early in my fieldwork, I jotted down the following
observation:
The work in the lab seems to me to have a lot of the messi-
ness of other lab work. Just as you can't see what proteins or
molecules are doing when you are working with them on a lab
bench, in this work one cannot see all your data at once, or re-
ally fully understand what the computer is doing with it. There
are analogous problems of scale—one realm is too small, the
other too informatically vast for you to try to manipulate indi-
vidual pieces. This means that you have to try to engineer ways
to get what you want out of the data, to make it present itself
in the interesting ways, to find the hidden pattern, just as you