Whole-genome shotgun sequencing
Whole-genome shotgun (WGS) sequencing, which involves
sequencing random genomic clones, has been used
successfully for small genomes. For these it is more
cost-effective than hierarchical sequencing because no
physical map is required. For large genomes, however,
physical map construction represents less than 5% of
the sequencing cost, so the financial gain is small.
Furthermore, assembling an entire genome at once, some
16,800 Mb in the case of hexaploid wheat, is far more
difficult than assembling a few hundred kilobases at
most, as is done when assembly proceeds BAC by BAC.
In rice, several WGS sequences
were produced in addition to the hierarchical
sequence of the rice genome (Goff et al., 2002; Yu
et al., 2002, 2005). The latest assemblies of both
the indica and japonica genomes, sequenced at 6X,
have an N50 value of around 23 kb for contigs and
30 kb for scaffolds. The N50 value is the length
such that 50% of all base pairs in the assembly lie
in contigs (or scaffolds) of at least that length.
By combining
information from both the 6X indica and japonica
WGS assemblies, these contigs were further
assembled into superscaffolds of 8 to 10 Mb
(Yu et al., 2005). Whole-genome shotgun assemblies
can also be improved by incorporating information
from BAC-end sequences and BAC lengths obtained
from fingerprint data (Warren et al., 2006). The
largest genome sequenced to date by the WGS approach
is around 3,500 Mb
(http://www.ncbi.nlm.nih.gov/genomes/leuks.cgi).
Applying WGS sequencing to wheat and assembling the
sequence would be feasible, at least for the low-copy
regions of the genome. Although cheaper than
BAC-by-BAC sequencing, it would still be costly:
a 6X shotgun Sanger sequence of the hexaploid wheat
genome would require some 120 million reads and
cost around $60 million.
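The N50 statistic defined above can be illustrated with a short Python sketch. The function name n50 and the toy contig lengths are assumptions chosen for illustration; they are not data from the rice assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    N50 is the largest length L such that contigs of length >= L
    together contain at least half of all assembled base pairs.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards, accumulating bases,
    # and stop as soon as half of the assembly has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

In this toy set the two longest contigs already hold half of the 32 assembled bases, so the N50 equals the length of the second of them.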
“Physical mapping in hexaploid wheat”) are
being constructed, and sequencing the wheat
genome BAC by BAC thus remains an option.
Other potential strategies are to limit sequencing
to the gene space, either by using gene-enrichment
techniques (Rabinowicz et al., 1999; Yuan et al.,
2003) or by judiciously choosing BAC clones that
are likely to contain genes, as has been done for
Lotus japonicus and Medicago truncatula (Young
et al., 2005). The new sequencing technologies
also provide new opportunities, as well as
challenges, for sequencing the wheat genome. The
advantages and disadvantages of the different
sequencing strategies, and the costs and benefits
of applying them to the wheat genome, are discussed
in subsequent sections.
Sanger sequencing
Hierarchical genome sequencing
Sequencing a genome BAC by BAC requires an anchored
physical map. Once a minimal tiling path (MTP) has
been established, each BAC in the MTP is sequenced
and assembled separately. The weakest link in this
approach is the physical map: the genome coverage it
provides is the major factor determining the
completeness of the sequence. The critical limiting
factor in applying this approach to the large wheat
genome is cost. Fingerprinting hexaploid wheat BAC
libraries at 15X coverage will cost roughly
$5 million. Sequencing an MTP of some 210,000 BAC
clones at 8X redundancy, using Sanger sequencing and
without finishing, will cost around $125 million
(calculated at $0.50/read). Constructing
chromosome-specific BAC libraries allows the burden
of physical mapping and sequencing of the hexaploid
wheat genome to be spread internationally over
different laboratories. While the BAC-by-BAC approach
would undoubtedly yield the most complete sequence,
a price tag of $6 million for sequencing an
individual chromosome may be too high for many
laboratories to participate.
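The cost figures quoted for the two Sanger-based strategies can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-envelope check, not the authors' costing model. It makes two assumptions: that the $0.50/read price stated for the BAC-by-BAC estimate also applies to WGS reads, and that the total cost is split evenly over the 21 chromosomes of hexaploid wheat, which ignores differences in chromosome size. All variable and function names are hypothetical.

```python
# Figures quoted in the text.
COST_PER_READ_USD = 0.50            # stated for the BAC-by-BAC estimate
BAC_BY_BAC_COST_USD = 125_000_000   # MTP of ~210,000 BACs at 8X, unfinished
WGS_READS = 120_000_000             # reads for a 6X shotgun of wheat
WGS_COVERAGE = 6
GENOME_SIZE_BP = 16_800 * 1_000_000  # 16,800 Mb hexaploid wheat genome

# Assumption: even split over the 21 chromosomes of hexaploid wheat.
WHEAT_CHROMOSOMES = 21


def reads_for_budget(budget_usd, cost_per_read_usd):
    """Number of Sanger reads a given budget buys at a flat per-read price."""
    return budget_usd / cost_per_read_usd


def mean_read_length_bp(genome_bp, coverage, reads):
    """Mean read length implied by a genome size, coverage and read count."""
    return genome_bp * coverage / reads


# Assumption: WGS reads cost the same as BAC-by-BAC reads.
wgs_cost_usd = WGS_READS * COST_PER_READ_USD
read_length_bp = mean_read_length_bp(GENOME_SIZE_BP, WGS_COVERAGE, WGS_READS)
bac_reads = reads_for_budget(BAC_BY_BAC_COST_USD, COST_PER_READ_USD)
per_chromosome_usd = BAC_BY_BAC_COST_USD / WHEAT_CHROMOSOMES

print(f"WGS cost: ${wgs_cost_usd / 1e6:.0f} million")
print(f"Implied mean read length: {read_length_bp:.0f} bp")
print(f"BAC-by-BAC reads: {bac_reads / 1e6:.0f} million")
print(f"BAC-by-BAC cost per chromosome: ${per_chromosome_usd / 1e6:.2f} million")
```

Under these assumptions the quoted numbers are mutually consistent: 120 million reads at $0.50 each give the $60 million WGS estimate, and $125 million divided over 21 chromosomes is close to the $6 million per chromosome cited for the BAC-by-BAC approach.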
Sequencing of gene-rich BAC clones
Sequencing of gene-rich BAC clones relies on the
assumption that genes are clustered in gene-rich