Biology Reference
In-Depth Information
SALIENT CHARACTERISTICS OF THE GENOME
AND TRANSCRIPTOMES OF A. SUUM: TOWARD
UNDERSTANDING THE MOLECULAR LANDSCAPE
OF THE PARASITE AND THE DESIGN OF
NEW INTERVENTIONS
Characterizing the Genome
The A. suum genome was sequenced at 82-fold coverage, producing
a final draft assembly of 272,782,664 bp (N50 = 407 kb; N90 = 80 kb;
1618 contigs of 2 kb) (Table 11.1) with a mean GC-content of 37.9%. Notably,
repetitive sequence (which was identified using the program Tandem
Repeats Finder [TRF] 44) in the A. suum assembly is remarkably low
(approximately 4.4% of the total assembly) relative to that reported for other
metazoan genomes sequenced to date, 23,32,45 including those sequenced employing
the same approach as used for A. suum. 16 There were various possible
explanations for this low repeat content. First, it was possible that the
assembly of the repeat content was poor and, thus, the genomic assembly
was not reflective of the true repetitive content of the genome. To assess
this possibility, we mapped all of our raw genomic sequence data (i.e.
reads) to the final assembly (using the program BWA 46 ) and assessed the
depth of coverage achieved for repetitive regions of the genome relative to
non-repetitive flanking regions (500 bp regions on either side of each
repeat). If significantly more repetitive content was present in the raw
data relative to the assembly, we would anticipate the repetitive regions to
be covered much more deeply than their flanking regions, as has been
demonstrated for genes of variable copy number in other studies. 47 For
the current A. suum assembly, no such difference in coverage was detected;
indeed, the mean coverage of the repetitive regions in the genome
(~68-fold) was slightly lower than that of the non-repetitive flanking regions
(76-fold coverage). A second possibility was that the repetitive sequence
data were present in the assembly but not identified using Tandem Repeats
Finder. To assess this possibility, we explored the repeat content of the
genome using the programs RepeatMasker, 48 LTR_FINDER, 49 PILER, 50
and RepeatScout. 51 Although no additional repeat content was detected,
this approach did allow us to identify transposable elements encoded
within the repetitive sequence data found. Indeed, ~75% (i.e. 3.2% of the
total assembly) of these repetitive sequences represented at least
22 families (8 LTR, 12 LINE, and 2 SINE) of retro-transposons and eight
families of DNA transposons (91 distinct sequences in total). This richness
of transposable element families is comparable with that predicted
for other genomes of parasitic helminths, 25,52,53 suggesting a third
explanation for the low repeat content in the A. suum genome.
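
The coverage check described above (depth within each repeat compared with depth in the 500 bp flanking regions on either side) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the per-base depth list, and the toy numbers are assumptions made for illustration. In practice the depth values would be derived from the read alignments, and flanks that overlap neighbouring repeats would need to be handled, which this sketch does not do.

```python
from statistics import mean


def repeat_vs_flank_depth(depth, repeats, flank=500):
    """Compare mean read depth inside repeats with depth in their flanks.

    depth   -- hypothetical per-base read depth for one contig (list of ints)
    repeats -- list of (start, end) half-open intervals marking repeats
    flank   -- width in bp of the flanking region taken on each side
    Returns (mean repeat depth, mean flank depth), or None if nothing is usable.
    """
    repeat_means, flank_means = [], []
    for start, end in repeats:
        inside = depth[start:end]
        # Flanks are clipped at the contig ends rather than padded.
        outside = depth[max(start - flank, 0):start] + depth[end:end + flank]
        if not inside or not outside:
            continue  # skip repeats with no usable repeat or flank positions
        repeat_means.append(mean(inside))
        flank_means.append(mean(outside))
    if not repeat_means:
        return None
    return mean(repeat_means), mean(flank_means)


# Toy contig (invented values): one 100 bp repeat between two 500 bp flanks.
toy_depth = [76] * 500 + [68] * 100 + [76] * 500
print(repeat_vs_flank_depth(toy_depth, [(500, 600)]))
```

On this toy input the repeat depth (68) is lower than the flank depth (76), the pattern reported in the text. Repeats that had been collapsed during assembly would instead show a repeat depth well above the flank depth.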