Biomedical Engineering Reference
Database Technology
The purpose of a database is to facilitate the management of data, a process that depends on people,
processes, and, as described here, the enabling technology. Consider that the thousands of base pairs
discovered every minute by the sequencing machines in public and private laboratories would be
practically impossible to record, archive, and either publish or sell to other researchers without
computer databases. At the current stage of database technology evolution, bioinformatics data
are housed on large hard drives in locker- or refrigerator-sized local servers and in online sequence
databases such as GenBank. Thanks to current computer technology, a bioinformatics
researcher can compare and contrast the genomes of a dozen species while sitting on the beach with
a laptop computer connected through a wireless modem to the Internet. While this image makes for
good advertising copy, in practice, most researchers are tied to wet laboratories that generate,
manipulate, and store vast quantities of experiment-specific data. In this context, database
technology empowers researchers to store their data in a form that can be quickly and easily
accessed, manipulated, compared to other data, and shared with other researchers.
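As a minimal sketch of what "stored, accessed, and compared" can look like in practice, the following uses Python's built-in sqlite3 module with an in-memory database. The table layout, accession labels, and sequences are all hypothetical, invented for illustration; they are not taken from GenBank or from any real schema.

```python
import sqlite3


def build_db(records):
    """Create an in-memory database and load (accession, organism, bases) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, chosen only for this illustration.
    conn.execute(
        "CREATE TABLE sequences ("
        "accession TEXT PRIMARY KEY, organism TEXT, bases TEXT)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def fetch_bases(conn, accession):
    """Return the stored sequence for an accession, or None if it is absent."""
    row = conn.execute(
        "SELECT bases FROM sequences WHERE accession = ?", (accession,)
    ).fetchone()
    return row[0] if row else None


def count_mismatches(a, b):
    """Compare two equal-length sequences position by position."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Made-up demonstration records.
records = [
    ("DEMO001", "species A", "ATGGCCATTG"),
    ("DEMO002", "species B", "ATGGCTATTG"),
]
conn = build_db(records)
diff = count_mismatches(fetch_bases(conn, "DEMO001"), fetch_bases(conn, "DEMO002"))
print(diff)
```

The position-by-position comparison is deliberately naive; real sequence comparison normally relies on alignment tools rather than a direct character match.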
The concept of a database is necessarily colored by the current state of the technology. Database
technology is constantly evolving, just as computer hardware is: a state-of-the-art bioinformatics
workstation, operating at gigahertz clock speeds with a gigabyte or more of RAM and banks of
hundred-gigabyte hard drives, would easily outperform one of the early supercomputers. Within our
lifetimes, the contents of
GenBank will easily fit into the working memory of a handheld computer, and our concept of what
constitutes a "large" database will have to be adjusted accordingly. Even so, there is more to the
concept of a database—whether it's referred to as a repository, data warehouse, data mart, or local
database—than raw capacity.
The volatility of the data, the concept of working memory, and the interrelatedness of data,
regardless of the volume of data involved, are distinguishing features of the various forms of memory
systems or databases. For example, from the perspective of working memory, the function of a data
warehouse is to move data from a variety of sources and prepare the data for incorporation into
working memory. Similarly, a data warehouse or other database is distinguished from an archive in
that the data in an archive are much further removed from working memory. An archive might be
stored on optical platters, magnetic tapes, or other media that are held in an offsite fireproof safe or
underground building. Furthermore, the archive is typically engineered for longevity and the ability to
be reconstituted, and not for speed of access. A database, in contrast, is a live, working system that
forms the centerpiece for biotech R&D activities.
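The distinction between these tiers can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not a description of any particular product: the function names, the record fields ("id", "bases", "source"), and the sample data are all hypothetical. The archive is modeled as a compressed blob that must be reconstituted before use, the warehouse step gathers records from several sources and normalizes them, and working memory is a plain dictionary keyed for immediate lookup.

```python
import gzip
import json


def archive(records):
    """Pack records for long-term storage: compact and restorable, not fast to query."""
    return gzip.compress(json.dumps(records).encode("utf-8"))


def reconstitute(blob):
    """Unpack an archived blob back into usable records."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


def warehouse_load(sources):
    """Gather records from several sources and normalize them to one shape."""
    prepared = []
    for source_name, rows in sources.items():
        for row in rows:
            prepared.append(
                {
                    "id": str(row["id"]).strip().upper(),
                    "bases": row["bases"].strip().upper(),
                    "source": source_name,
                }
            )
    return prepared


def to_working_memory(prepared):
    """Index prepared records by id so an application can reach them directly."""
    return {rec["id"]: rec for rec in prepared}


# Made-up input from two hypothetical laboratories with inconsistent formatting.
sources = {
    "lab_a": [{"id": "s1 ", "bases": "atgc"}],
    "lab_b": [{"id": "S2", "bases": " GGCA"}],
}
prepared = warehouse_load(sources)
working = to_working_memory(prepared)
blob = archive(prepared)
restored = reconstitute(blob)
print(working["S1"]["bases"], restored == prepared)
```

The point of the sketch is the asymmetry: a lookup in `working` is immediate, whereas nothing in `blob` can be read until the whole archive has been reconstituted.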
Functionally, the relationship between various database technologies can be compared to the
information stored in the body, as depicted in Figure 2-11. Just as it's inefficient to have papers
strewn about an office, out of order, difficult to identify, and distracting the user's attention from the
documents that should be addressed, our genetic information is stored in the genome, tightly
packed, out of harm's way, and yet accessible. The data are there, as in an archive, but not
immediately available. Focusing on the individual chromosomes, data are more readily available, but
still packed away so that they don't interfere with cellular processes. As subsets of data are moved
out of the chromosome to the work environment, through the process of transcription, data are more
readily available for use. Finally, at the translation stage, the data serve as the basis for the current
work (as data do for computer applications), whether creating proteins according to the Central
Dogma, or attempting to locate a matching gene in a pattern-matching application.
Figure 2-11. Organic Analog of Database Hierarchy. The database hierarchy
has many parallels to the hierarchy in the human genome. Data stored in
chromosomes, like a data archive, must be unpacked and transferred to a
more immediately useful form before the data can be put to use.
 
 