comparing sequences with one another was really a problem of finding
a “metric space of sequences.” 49
Under the supervision of Goad, a group of young physicists (including
Temple Smith, Michael Waterman, Myron Stein, William A. Beyer, and
Minoru Kanehisa) began to work on these problems of sequence
comparison and analysis, making important advances both mathematically
and in software. 50 T-10 fostered a culture of intense intellectual
activity; its members realized that they were pursuing a unique approach
to biology with skills and resources available to few others. 51 Within the
group, sequence analysis was considered a problem of pattern matching
and detection: within the confusing blur of As, Gs, Ts, and Cs in a DNA
sequence lay hidden patterns that coded for genes or acted as protein-specific
binding sites. Even the relatively short (by contemporary standards)
nucleotide sequences available in the mid-1970s contained hundreds
of base pairs, far more than could be made sense of by eye. As
a tool for dealing with large amounts of data and for performing statistical
analysis, the computer was ideal for sequence analysis. 52 Goad's
earlier work in physics and biology had used computers to search for
statistical patterns in the motion of neutrons or macromolecules; here,
also by keeping track of large amounts of data, computerized stochastic
techniques (e.g., Monte Carlo methods) could be used for finding
statistical patterns hidden in the sequences. As the Los Alamos News
Bulletin said of Goad's work on DNA in 1982, “Pattern-recognition
research and the preparation of computer systems and codes to simplify
the process are part of a long-standing effort at Los Alamos—in part
the progeny of the weapons development program here.” 53 Goad's work
used many of the same tools and techniques that had been developed
at the laboratory since its beginnings, applying them now to biology
instead of bombs.
The Los Alamos Sequence Database—and eventually GenBank—
evolved from these computational efforts. For Goad, the collection of
nucleotide sequences went hand in hand with their analysis: collection
was necessary in order to have the richest possible resource for analytical
work, but without continuously evolving analytical tools, a collection
would be just a useless jumble of base pairs. In 1979, Goad
began a pilot project with the aim of collecting, storing, analyzing, and
distributing nucleic acid sequences. This databasing effort was almost
coextensive with the analytical work of the T-10 group: both involved
using the computer for organizing large sets of data. The techniques of
large-scale data analysis required for sequence comparison were very