7.10.1 The Original Drosophila Genome Project
The original Drosophila Genome Project had the following aims:
1. Develop a high-resolution physical map that would serve as a basis for DNA
sequencing and detailed functional studies. A physical map is a series of
overlapping clones whose end sequences and physical locations on the
chromosomes are known. The physical map would be integrated into a database
with cross-references to the genetic information already available for
D. melanogaster.
2. Conduct feasibility studies for large-scale DNA sequencing projects, especially
for regions containing DNA of great biological interest. Large-scale studies
were defined as those that attempted to determine 3 megabase pairs (Mbp)
of contiguous DNA sequence within 3 years using the Sanger method.
3. Develop new bioinformatic techniques to identify coding sequences
in genomic DNA and to obtain high-quality cDNA libraries that were
representative of the complete coding information of the genomic DNA
(Merriam et al. 1991).
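
As a rough illustration of the physical map described in aim 1, the following Python sketch groups clones into contigs, that is, runs of clones that overlap their neighbors. The `Clone` record, the clone names, and the kilobase coordinates are hypothetical and were chosen only for illustration; they are not data from the project.

```python
from typing import NamedTuple


class Clone(NamedTuple):
    """A mapped clone (hypothetical record layout)."""
    name: str
    start: int  # chromosomal position of the left end, in kb
    end: int    # chromosomal position of the right end, in kb


def build_contigs(clones):
    """Group clones into contigs: runs of clones that overlap one another."""
    contigs = []
    for clone in sorted(clones, key=lambda c: c.start):
        if contigs and clone.start <= contigs[-1]["end"]:
            # The clone overlaps the current contig, so it extends that contig.
            contigs[-1]["clones"].append(clone.name)
            contigs[-1]["end"] = max(contigs[-1]["end"], clone.end)
        else:
            # No overlap with the previous contig: a gap in the map starts here.
            contigs.append(
                {"start": clone.start, "end": clone.end, "clones": [clone.name]}
            )
    return contigs


# Invented example: four clones, one of which is isolated from the others.
example_clones = [
    Clone("clone-a", 0, 80),
    Clone("clone-b", 60, 150),
    Clone("clone-c", 200, 290),
    Clone("clone-d", 140, 180),
]
example_contigs = build_contigs(example_clones)
```

Each gap between contigs marks a region where no overlapping clone has yet been found, which is the kind of information a physical map is meant to expose before sequencing begins.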
7.10.2 The Actual Drosophila Genome Project
The Drosophila Genome Project was actually completed much more quickly than
originally planned, and by a different strategy (Adams et al. 2000, Pennisi
2000a). Drosophila melanogaster became only the second multicellular organism
(after the nematode C. elegans) to have its entire genome sequenced.
The Drosophila sequencing effort began in 1990 and was only partially
complete when Venter et al. (1996) proposed using a “shotgun strategy.” This
was a novel approach to sequencing such a large genome. It involved breaking
the entire genome into small pieces, sequencing them with an array of very
fast and expensive new Sanger sequencing machines, and then using powerful
supercomputers to assemble the sequenced fragments into the correct order.
A company founded by Craig Venter (Celera), the Berkeley Drosophila Genome
Project, and its European counterpart collaborated to guide the work and
interpret the data.
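
The idea behind the shotgun strategy can be sketched in a few lines of Python. This toy example samples short reads from a made-up 16-base sequence and reassembles them by repeatedly merging the pair with the longest overlap (a simple greedy approach). It is an illustrative assumption, not the assembler used by Celera; the function names and the test sequence are invented here, and the sketch ignores sequencing errors, the two DNA strands, and repeated sequences.

```python
import random


def shotgun_reads(genome, read_len, n_reads, seed=0):
    """Sample n_reads substrings ("reads") of length read_len at random positions."""
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    return [
        genome[s:s + read_len]
        for s in (rng.randrange(last_start + 1) for _ in range(n_reads))
    ]


def overlap(a, b, min_len):
    """Longest proper suffix of a that equals a prefix of b, or 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)) - 1, min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def drop_contained(seqs):
    """Remove duplicates and any sequence that lies wholly inside another."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != t and s in t for t in unique)]


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlap of min_len or more remains."""
    contigs = drop_contained(reads)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # the remaining contigs are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_k:]
        rest = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])
    return contigs


# Made-up sequence in which no 3-base word occurs twice (so no repeats).
genome = "ATGCCGTAAGCTTGAC"
reads = shotgun_reads(genome, read_len=8, n_reads=40)
contigs = greedy_assemble(reads)
print(contigs)
```

When the reads cover every position, a sketch like this would be expected to return a single contig matching the original sequence; with too few reads it returns several contigs separated by gaps. A genome containing repeats can mislead the overlap step, because identical stretches from different locations look like genuine overlaps.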
Shotgun sequencing had never been attempted previously with such a complex
genome. The complexity is due to the presence of hundreds to thousands of
repeated sequences that are scattered throughout the genome and cause
problems in assembly of the sequence data. The solution was to obtain
sequences from both ends of fragments (paired ends) that were approximately
2, 10, and 150 kb long. These oriented bits of sequence were assembled into
increasingly dense and interlinked scaffolds that generated long continuous
stretches of DNA sequence