rithm in the C language, which allows him much finer control
over memory access. Henry also spends a further two weeks
“profiling” his code: he runs a program on his code while it is
running to analyze where it is spending the most time—which
lines are acting as bottlenecks. Once bottlenecks are identified,
Henry works to speed up these parts of the code, perhaps in-
creasing its speed by a factor of 2 or 3 here and there. After
about eight weeks, with the revised algorithm now running in
C, Henry predicts that his code will run on the data set he has in
mind in about two or three CPU-months. On a powerful com-
puter cluster, this amounts to mere days. The problem is now
tractable, and Henry's work is done. 12
Both of these sets of activities are primarily computational, yet the
biology-trained student's approach differs markedly from the computer
science-trained student's approach. For the former, the computer programs
are not ends in themselves; it is sufficient that they work at all.
Biology-trained individuals are often “hackers,” piecing together results
by using whatever resources they have at hand (other people's code, the
Internet). The computer scientist, on the other hand, considers programming
an art: the final product (the code) should provide an elegant solution
to the problem at hand. This difference is manifest in the different
sorts of practices we see in this example: the biology trainee spends the
most time writing small chunks of code that help to order and make
sense of the data; the computer scientist is interested in the whole product
and spends the most time revising it over and over again to find an
optimal solution to the problem. The status of wet biology is also un-
derstood very differently: for the biologically trained, the wet lab must
be the ultimate arbiter of any finding, whereas the computer scientist,
while acknowledging its importance, produces results that are designed
to stand on their own.
These differences in approach can be understood as two distinct at-
titudes toward data: for the computer scientist, the point of his or her
work is the elegant manipulation of data; for the biologist, the wet stuff
of biology takes precedence over data. Moreover, the computer scien-
tist's work is oriented toward mastering and digesting the whole quan-
tity of data, which comes down to a problem of managing the quantity
of data by increasing speed. The biologist, on the other hand, is more
concerned with the particularity of specific cases: it might take years of
work to validate every result shown by the data in the wet lab, so more
work must be done to narrow down the data to biologically plausible