Biomedical Engineering Reference
scan horizontally across the base of a slab gel, many
separate sequences can be scanned, one sequence
per lane. Because the different fluorophores affect
the mobility of fragments to different extents,
sophisticated software is incorporated into the scanning
step to ensure that bands are read in the correct
order. A simpler method is to use only one
fluorophore and to run the different chain-terminating
reactions in different lanes.
For high-sensitivity DNA detection in four-colour
sequencing and high-accuracy base calling, each of
the four dyes should ideally meet three criteria: it
should absorb strongly at a common laser wavelength;
it should have an emission maximum at a wavelength
distinctly different from those of the other dyes; and
it should introduce the same relative mobility shift in
the DNA sequencing fragments. Recently, dyes with
these properties have been identified and successfully
applied to automated sequencing (Glazer &
Mathies 1997).
Automated DNA sequencers offer a number of
advantages that are not particularly obvious. First,
manual sequencing can generate excellent data, but
even in the best sequencing laboratories poor
autoradiographs are frequently produced that make
sequence reading difficult or impossible. Usually the
problem is related to the need to run different termination
reactions in different tracks of the gel. Skilled
workers ignore bad sequencing tracks, but
many laboratories do not. This leads to poor-quality
sequence data. The use of a single gel track for all
four dideoxy reactions means that this problem is
less acute in automated sequencing. Nevertheless, it
is desirable to sequence a piece of DNA several times
and on both strands, to eliminate errors caused by
technical problems. It should be noted that long
runs of the same nucleotide or a high G+C content
can cause compression of the bands on a gel,
necessitating manual reading of the data, even with
an automated system. Note also that multiple,
tandem short repeats, which are common in the
DNA of higher eukaryotes, can reduce the fidelity
of DNA copying, particularly with Taq DNA
polymerase. The second advantage of automated DNA
sequencers is that the output from them is in
machine-readable form. This eliminates the errors
that arise when DNA sequences are read and
transcribed manually.
A third advantage derives from the new generation
of sequencers that have been introduced
recently. In these sequencers, the slab gel is replaced
with 48 or 96 capillaries filled with the gel matrix.
The key feature of this system is that the equipment
has been designed for use with robotics, thereby
minimizing hands-on time and increasing throughput.
With a 96-capillary sequencer, it is possible to
sequence up to 750 000 nucleotides per day.
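
As a rough check on the throughput figure quoted above, the sketch below spreads the stated 750 000 nucleotides per day across 96 capillaries. The read length of 600 nucleotides is an assumption made purely for illustration (it is not given in the text), and the function names are hypothetical.

```python
def nucleotides_per_capillary(total_per_day: float, capillaries: int) -> float:
    """Average number of nucleotides each capillary must deliver per day."""
    return total_per_day / capillaries


def runs_per_day(total_per_day: float, capillaries: int, read_length: float) -> float:
    """Number of full instrument runs per day needed to reach the daily total,
    if every capillary yields one read of `read_length` nucleotides per run."""
    return total_per_day / (capillaries * read_length)


if __name__ == "__main__":
    TOTAL_PER_DAY = 750_000      # figure quoted in the text
    CAPILLARIES = 96             # figure quoted in the text
    ASSUMED_READ_LENGTH = 600    # assumption for illustration only

    print(nucleotides_per_capillary(TOTAL_PER_DAY, CAPILLARIES))
    print(runs_per_day(TOTAL_PER_DAY, CAPILLARIES, ASSUMED_READ_LENGTH))
```

Under the assumed read length, the quoted maximum implies on the order of a dozen runs per day, which is only attainable with the robotic loading described above.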
Sequence accuracy
As part of a programme to sequence a 96 kb stretch
of mouse DNA, Wilson et al. (1992) analysed 288
sequences containing part of the vector DNA. By
comparing raw sequence data with known vector
sequences, it was possible to calculate the error
frequency. Sequences that were read beyond 400 bp
contained an average of 3.2% errors, while those shorter
than 400 bp contained 2.8% errors. At least one-third of
the errors were due to ambiguities in sequence reading.
In the sequences read beyond 400 bp, most errors
occurred late in the sequence and were often present
as extra bases in a run of two or more of the same
nucleotide. The remainder of the errors were due to
secondary structure in
the template DNA. However, because the complete
sequence was analysed with an average 5.9-fold
redundancy and most of it on both strands, the final
error frequency is estimated to be less than 0.1%. In
comparison, 35 different European laboratories were
engaged in sequencing the Saccharomyces cerevisiae
genome, with the attendant possibility of a very high
error frequency. However, by using a DNA coordinator
who implemented quality-control procedures
(Table 7.1), the overall sequence accuracy for yeast
chromosome XI (666 448 bp) was estimated to
be 99.97% (Dujon et al. 1994), i.e. similar to that
noted above. Lipshutz et al. (1994) have described a
software program for estimating DNA sequence
confidence. Fabret et al. (1995) have analysed the
errors in finished sequences. They took advantage of
the fact that the surfactin operon of Bacillus subtilis
had been sequenced by three independent groups.
This enabled the actual error rate to be calculated. It
was found to range from 0.02 to 0.27%, the different
error rates being ascribed to the detailed sequencing
tactics used. Other studies of DNA sequencing
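
The error-frequency figures in this section rest on comparing a read with a sequence that is already known, such as the vector. A minimal sketch of that kind of calculation is given below. It assumes a simple edit-distance comparison (substitutions, insertions and deletions each count as one error), which is not necessarily the procedure used in the studies cited, and the function names are hypothetical. The second helper converts a stated accuracy into an expected number of errors, using the chromosome XI figures quoted above.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of substitutions,
    insertions and deletions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def error_frequency(raw_read: str, reference: str) -> float:
    """Errors per reference base, as a fraction (multiply by 100 for %)."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(raw_read, reference) / len(reference)


def expected_errors(length_bp: int, accuracy: float) -> float:
    """Expected number of wrong bases in a sequence of the given length
    when `accuracy` is the fraction of bases that are correct."""
    return length_bp * (1.0 - accuracy)


if __name__ == "__main__":
    # Toy example: one extra T in a run of T residues, the kind of
    # error described in the text for the later part of a read.
    print(error_frequency("GATTTACA", "GATTACA"))
    # Chromosome XI: 666 448 bp at an estimated accuracy of 99.97%.
    print(expected_errors(666_448, 0.9997))
```

On the quoted figures, an accuracy of 99.97% over 666 448 bp corresponds to roughly 200 incorrect bases in the finished chromosome sequence.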