complex relationships between sequences and pieces of sequences based
on maps or alignments, 87 but also provided sophisticated and robust
links to published scientific articles. Citations in various databases were
mapped to MEDLINE via unique integer identification numbers. Appropriate
software could rapidly search (across multiple databases) for
objects that cited the same article and link those objects together, or it
could go even further and make links based on keywords from the abstracts
contained in MEDLINE. By rendering the model in ASN.1, the
NCBI created a system that combined objects (DNA sequences, protein
sequences, references, sequence features) from a variety of databases
and manipulated them all with a common set of software tools.
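
The citation-based linking described above can be illustrated with a minimal sketch. It assumes each record carries a list of integer MEDLINE identifiers; the database names, object IDs, identifier values, and the function name link_by_citation are all hypothetical and are not taken from the NCBI software.

```python
from collections import defaultdict

# Hypothetical records: (database name, object ID, cited MEDLINE integer IDs).
# All values are invented for illustration.
records = [
    ("nucleotide_db", "seq_A", [101, 202]),
    ("protein_db", "prot_B", [202]),
    ("structure_db", "struct_C", [303]),
    ("nucleotide_db", "seq_D", [101]),
]


def link_by_citation(records):
    """Group objects from any database by the article they cite."""
    by_uid = defaultdict(list)
    for db, obj_id, uids in records:
        for uid in uids:
            by_uid[uid].append((db, obj_id))
    # Keep only articles cited by two or more objects: these form the links.
    return {uid: objs for uid, objs in by_uid.items() if len(objs) > 1}


links = link_by_citation(records)
for uid, objs in sorted(links.items()):
    print(uid, objs)
```

Because the shared key is a plain integer, objects from otherwise unrelated databases can be joined without either database knowing the other's schema.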
DNA-centered relational databases provided more flexible ways to
recombine and reorder sequences. ASN.1 and the data model permitted
no static biological objects. Rather, it was assumed that the process
of doing biology would involve recombination and reordering of different
biological objects across a wide range of databases. Relational
databases were a framework within which to investigate the properties
of dynamically rearrangeable sequence elements. The data model was a
framework within which to investigate genomes using a wide variety of
other data and data types.
The data model has provided a framework for exemplary experiments
of the postgenomic era. Although it was developed in 1990, it
remains a powerful tool for moving biological investigation beyond the
genome. As biologists began to realize the limitations of studying the
genome in isolation, the data model demonstrated ways in which to
integrate more and more kinds of biological data.
In 2005, the bioinformatician Hans P. Fischer called for “inventoriz-
ing biology”—capturing the entirety of information about an organism
in databases. Genomes, transcriptomes, proteomes, metabolomes, interactomes,
and phenomes should be characterized, entered into databases,
and integrated. This new “quantitative biology” would transform drug
discovery and allow us to understand human disease pathways. This
vision of “tightly integrated biological data” would allow an engineering-like
approach to biological questions: drug design or even understanding
a disease would become more like building an aircraft wing. 88
In the postgenomic era, the organization and integration of biological
information provides a structure or blueprint from which biologists can
work. At the beginning of each year, Nucleic Acids Research publishes
a “database issue” that provides an inventory of biological databases.
In 2009, that list included 1,170 databases, including about 100 new
entries. 89 The ways in which the information in those databases is con-