Biomedical Engineering Reference
In-Depth Information
This biological information is quite data-intensive; the DNA of a single
human contains about 6 gigabits of information, and the number of genes that
may potentially be expressed totals approximately 30,000 (up to 15,000
genes may be expressed in each particular cell type, and there are thousands of
cell types). The DNA of a single individual contains about 3 × 10^9 bases, which
(with 4 possible bases, i.e., 2 bits per base) is 6 × 10^9 bits. The DNA of a million individuals (e.g., a large
military force) therefore requires 6 petabits (a petabit is 10^15 bits). The expression
information for a few dozen cell types in each of a million individuals may
also require multiple petabits. Although the acquisition of such a vast DNA
databank may be feasible via standard biotechnology, the rapid transfer of the
DNA of such a large number of individuals into digital media seems infeasible,
due to the tedious and time-consuming nature of DNA sequencing. Even if this
large amount of information could be transferred into digital media, it certainly
would not be compact: current storage technologies require considerable volume
(at least a few dozen cubic meters) to store a petabit. Furthermore, even simple
database operations on such a large amount of data require vast computational
processing power (if executed in a few minutes).
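The storage estimate above can be reproduced with a short back-of-envelope calculation. The following is only a sketch using the round figures quoted in the text (about 3 × 10^9 bases per individual, a 4-letter alphabet, one million individuals); the constant and function names are illustrative and are not part of the original chapter.

```python
# Back-of-envelope storage estimate using the round figures quoted in the text.
# All names here are illustrative assumptions, not taken from the chapter.

BASES_PER_GENOME = 3 * 10**9   # approximate number of bases in one individual's DNA
ALPHABET_SIZE = 4              # the four DNA bases: A, C, G, T
INDIVIDUALS = 10**6            # e.g., a large military force
BITS_PER_PETABIT = 10**15


def bits_per_symbol(alphabet_size: int) -> int:
    """Smallest whole number of bits that can distinguish alphabet_size symbols."""
    return (alphabet_size - 1).bit_length()


def genome_bits(bases: int = BASES_PER_GENOME) -> int:
    """Bits needed to encode one genome at a fixed number of bits per base."""
    return bases * bits_per_symbol(ALPHABET_SIZE)


def population_petabits(individuals: int = INDIVIDUALS) -> float:
    """Total storage, in petabits, for the genomes of a whole population."""
    return individuals * genome_bits() / BITS_PER_PETABIT


if __name__ == "__main__":
    print("bits per base:      ", bits_per_symbol(ALPHABET_SIZE))
    print("bits per genome:    ", genome_bits())
    print("petabits, 1M people:", population_petabits())
```

This counts raw sequence only; expression data for multiple cell types, which the text estimates at further petabits, would be added on top.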
1.2. Overview of the Biomolecular Database System
This chapter presents the architecture of a Biomolecular Database system
for the efficient storage, processing, and retrieval of genetic information and
material. It completely bypasses the usual transformation from biological material
(genomic DNA and transcribed RNA) to digital media, as done in conventional
bioinformatics. Instead, biotechnology techniques provide the needed
capability of a Biomolecular Database system, without ever transferring the biological
information into digital media. Such a system may provide a potentially unique and
revolutionary capability in genomics.
1.2.1. DNA: An Ultra-Compact Storage Media
The storage medium of this database system consists of strands of
DNA, which are (in comparison to RNA) relatively stable and non-reactive: they
can be stored for a number of years without significant degradation. In particular,
the genetic information can be stored in the form of DNA strands containing
fragments of genomic DNA as well as appended strands of synthesized DNA
("information tags") encoding information relevant to the genomic DNA. This
Biomolecular Database is capable of containing a vast store of genomic DNA
obtained from many individuals (e.g., multiple divisions of an army). We can
provide the store with a redundancy (i.e., a number of copies of each DNA in the
database) that ranges from a few hundred or thousand downwards to perhaps 10,