Millions of individual bacterial colonies are produced and individually placed in
multiwell plates by robotics to isolate individual DNA clones. This DNA then
goes through a sequencing reaction, and the sequenced DNA undergoes capillary
electrophoresis, in which labeled nucleotides are collected and scanned by a
laser to produce sequencing reads. The raw data are then converted into computer
files showing the sequence and the quality of each base. The final data
are stored and released to public databases such as GenBank.
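The per-base quality recorded in such files is commonly expressed as a Phred
score, Q = -10 log10(P), where P is the estimated probability that the base call
is wrong. The following is a minimal Python sketch of that conversion, assuming
Phred-scaled qualities; the function names are hypothetical and are not taken
from any particular base-calling package.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability P to a Phred quality score Q = -10*log10(P)."""
    return -10 * math.log10(p)


def expected_errors(qualities):
    """Sum the per-base error probabilities to estimate the errors in one read."""
    return sum(phred_to_error_prob(q) for q in qualities)


if __name__ == "__main__":
    # One error in 10,000 bases corresponds to an error probability of 1e-4.
    print(round(error_prob_to_phred(1e-4)))
    # Hypothetical qualities for a short read, for illustration only.
    print(expected_errors([40, 40, 30, 20]))
```

On this scale, an accuracy of one error in 10,000 bases corresponds to a score
of about 40, and a score of 20 corresponds to one error in 100 bases.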
As a result of the industrialization of DNA sequencing by the Sanger method,
the cost of sequencing the Drosophila genome decreased to US$0.20-0.30 per base,
with the accuracy held to less than one error in 10,000 bases.
During 2000, approximately one complete bacterial genome was obtained each
month. The Drosophila Genome Project was completed in fall 2000 (Adams et al.
2000), and on June 26, 2000, a working draft of the human genome was completed
at a cost of US$1 billion (Bentley 2006). The year 2000 was called "The
Year of the Genome," but it was only the start of the genomics revolution. In
2006, sequencing the human genome using the Sanger method would have required
30 instruments running for 1 year and would have cost US$10 million
(Bentley 2006). As discussed in Section 7.11, newer sequencing methods have
reduced the costs and time to completion and revolutionized genome sequencing,
although they have also made analysis more complex and time-consuming.
7.8 Analyzing DNA Sequence Data
Even small-scale DNA sequencing projects generate substantial amounts of data
and require computer assistance for their analysis (Reese et al. 2000, Stein 2001,
Mount 2004, Hodgman et al. 2009, Pevsner 2009). Software packages are available
for laboratory computer systems and, depending on the size of the computer,
can analyze the sequences in greater or lesser detail. DNA sequences
obtained from automated sequencing machines are provided online or on
computer disk.
Computer programs can compare reads from several sequencing runs, search
for and identify overlaps, compare results from sequencing the complementary
strands of the DNA, and identify possible errors. Once the sequences have been
entered into the computer, the next step is to analyze the data.
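The overlap search and the comparison of complementary strands described above
can be sketched in a few lines of Python. This is a simplified illustration and
not the method of any specific assembly program: it assumes exact matches
between reads, whereas real software must tolerate sequencing errors. All
function names, the min_overlap parameter, and the example reads are
hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that exactly matches a prefix of `right`."""
    max_len = min(len(left), len(right))
    for length in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0  # no overlap of at least min_overlap bases


def merge_reads(left, right, min_overlap=3):
    """Join two overlapping reads into one sequence, or return None if they do not overlap."""
    length = overlap_length(left, right, min_overlap)
    if length == 0:
        return None
    return left + right[length:]


def strand_mismatches(forward, reverse):
    """Positions where a forward read disagrees with the reverse complement
    of a read taken from the complementary strand (possible errors)."""
    rc = reverse_complement(reverse)
    return [i for i, (a, b) in enumerate(zip(forward, rc)) if a != b]


if __name__ == "__main__":
    # Made-up reads for illustration only.
    print(merge_reads("ATGCGTAC", "GTACCTTA"))
    print(strand_mismatches("ATGC", "GCTT"))
```

Checking the longest candidate overlap first means that the merge uses the most
specific match available; the minimum overlap guards against joining reads that
share only a few bases by chance.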
In a shotgun genome-sequencing project, the DNA is broken into fragments
that are cloned and sequenced. The relationships between the cloned fragments
are determined by comparing their sequences. DNA segments related
to one another by a partial overlap are called contigs. If a sequence overlaps
with another, then the two contigs can be joined. The process of comparing