also a critical parameter, and given the need to encompass the bacterial diversity within a fish
gut, many thousands of sequences should be generated per sample. While a GS FLX system
will generate up to 1 million reads per full plate, a Genome Analyzer IIx can generate 40
million single reads per lane, for a total of 320 million across all eight lanes. The use
of paired-end Illumina sequencing for 16S rRNA surveys retains the advantages of the larger
number of reads that can be obtained with this technology, while including a quality control
assembly step that minimizes sequencing errors (Bartram et al. 2011).
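The quality-control assembly step can be illustrated with a minimal sketch. It assumes the simplest possible rule: the reverse read is reverse-complemented, the two reads must agree exactly over their overlap, and any pair that fails this test or contains an ambiguous base (N) is discarded. The function names and the min_overlap parameter are hypothetical, and this is not the published procedure of Bartram et al. (2011), only an illustration of the idea.

```python
from typing import Optional

# Translation table for complementing unambiguous bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str, min_overlap: int = 10) -> Optional[str]:
    """Assemble a read pair into one sequence, or return None to discard it.

    Assumption (illustrative only): the overlap must match exactly, and
    the longest exact overlap of at least `min_overlap` bases is used.
    """
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    for k in range(longest, min_overlap - 1, -1):
        # Compare the last k bases of the forward read with the first k
        # bases of the reverse-complemented mate.
        if forward[-k:] == rc[:k]:
            merged = forward + rc[k:]
            # Discard assemblies that still contain ambiguous bases.
            return None if "N" in merged else merged
    return None  # no acceptable overlap: the pair is discarded
```

Calling merge_pair on each read pair and keeping only the non-None results gives the assembled set of sequences; requiring exact agreement is stricter than most real tools, which weigh base quality scores when resolving mismatches.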
Depending on the number of samples and the desired sequencing coverage per sample,
different sequencing technologies may therefore be appropriate. Since the error rate for NGS is
higher than that for Sanger sequencing (Zagordi et al. 2010), trimming sequences for sufficient
quality will reduce the number of useful sequences for phylogenetic analysis and this should
be taken into account when planning the extent of sequencing coverage. Lastly, the ability
to include multiple samples per run is important in the estimation of sequencing coverage per
sample. In the case of 16S rRNA gene amplicon sequencing, unique 6 to 10 bp 'bar-codes' can
be incorporated at the 5' end of a primer, permitting post-sequencing segregation of sequences
into specific sample sets (Hamady et al. 2008). This approach does require ordering multiple
unique primers separately, although the cost of this one-time purchase is offset by the
longer-term utility of these primer sets across multiple experiments.
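The post-sequencing segregation of bar-coded reads can be sketched as follows. This is a simplified illustration that assumes each read begins with its bar-code and that bar-codes are matched exactly; the bar-code sequences and sample names in the usage note are invented, and no error correction of bar-codes is attempted. Counting the reads assigned to each sample gives a direct estimate of the sequencing coverage obtained per sample.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def demultiplex(
    reads: Iterable[str], barcode_to_sample: Dict[str, str]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Sort reads into per-sample lists by their leading bar-code.

    Assumption (illustrative only): the bar-code sits at the very start
    of each read and must match exactly. Returns the per-sample reads
    with the bar-code removed, plus the reads that matched no bar-code.
    """
    # Try longer bar-codes first so a short bar-code that happens to be
    # a prefix of a longer one does not capture its reads.
    lengths = sorted({len(b) for b in barcode_to_sample}, reverse=True)
    by_sample: Dict[str, List[str]] = defaultdict(list)
    unassigned: List[str] = []
    for read in reads:
        for n in lengths:
            sample = barcode_to_sample.get(read[:n])
            if sample is not None:
                by_sample[sample].append(read[n:])  # strip the bar-code
                break
        else:
            unassigned.append(read)
    return dict(by_sample), unassigned


def reads_per_sample(by_sample: Dict[str, List[str]]) -> Dict[str, int]:
    """Number of reads recovered for each sample."""
    return {sample: len(seqs) for sample, seqs in by_sample.items()}
```

For example, with the hypothetical mapping {"ACGTAC": "gut_A", "TGCATGCA": "gut_B"}, a read starting with ACGTAC is assigned to gut_A and stored without its first six bases, while a read matching neither bar-code is returned in the unassigned list.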
From a practical standpoint, the higher cost of NGS compared with DGGE or other analyses
should be considered. The per-sample cost of NGS can be reduced by analysing multiple
samples simultaneously, by reducing the number of sequences generated per sample, and/or
by using a smaller region of a sequencing plate (e.g. one-quarter of a 454 pyrosequencing
plate or one lane of an Illumina sequencer). The cost of sequencing per
base pair has dropped precipitously and this trend is expected to continue, so that, by
combining cost-saving measures with massively parallel sequencing, bacterial community
analyses are likely to be conducted increasingly using NGS approaches. With cost-saving
measures, care must be taken to avoid reducing sequencing coverage to an extent that results
in an inadequate survey. If possible, a preliminary survey may be useful in assessing the
sequencing coverage needed, combined with generation of a rarefaction curve to establish the
degree of sequencing redundancy achieved.
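A rarefaction curve for a preliminary survey can be computed as in the sketch below. It assumes the reads have already been grouped into OTUs and uses the standard analytical expectation for sampling without replacement, in which the expected number of OTUs seen in a subsample of n reads is the sum over OTUs of 1 - C(N - N_i, n) / C(N, n), where N is the total number of reads and N_i the number of reads in OTU i. The function names and the step parameter are illustrative choices rather than part of any particular software package.

```python
from math import comb
from typing import List, Sequence, Tuple


def rarefied_richness(otu_counts: Sequence[int], depth: int) -> float:
    """Expected number of OTUs observed in a random subsample of `depth` reads."""
    total = sum(otu_counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must lie between 0 and the total read count")
    all_draws = comb(total, depth)
    # For each OTU, 1 minus the probability that none of its reads is drawn.
    # comb() returns 0 when depth exceeds the reads outside that OTU.
    return sum(
        1 - comb(total - count, depth) / all_draws
        for count in otu_counts
        if count > 0
    )


def rarefaction_curve(
    otu_counts: Sequence[int], step: int = 1
) -> List[Tuple[int, float]]:
    """(depth, expected richness) pairs from 0 reads up to the full sample."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    total = sum(otu_counts)
    depths = list(range(0, total, step)) + [total]
    return [(n, rarefied_richness(otu_counts, n)) for n in depths]
```

A curve that flattens well before the full sequencing depth suggests that additional reads would mostly re-sample OTUs already detected, whereas a curve that is still rising at the final depth suggests that the survey has not captured the diversity present.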
Revealing a more complete picture of intestinal microbial diversity via high-throughput
sequencing produces a massive amount of data, which poses a significant bioinformatics
challenge. Fortunately, tools are available to help process and interpret
this wealth of sequencing data. The first bioinformatics step is to trim the raw sequence data
to remove errors. As mentioned previously, a paired-end strategy may be adopted wherein
each sequence is assembled with its mate pair and any ambiguous sequences are discarded,
resulting in higher quality and longer length 16S rRNA gene sequences (Bartram et al. 2011).
Another strategy for removing errors associated with 454 pyrosequencing data is to use the
algorithm PyroNoise, which results in a more accurate reflection of OTUs (Quince et al.
2009). Whatever the sequencing technology, rigorous trimming should be conducted to
achieve a conservative estimate of bacterial diversity and avoid artifactually high estimates of
bacterial richness (Huse et al. 2007; Reeder and Knight 2010). Sequences trimmed for quality
may be rapidly compared to the database of 16S rRNA gene sequences using bioinformatics
tools available at the Ribosomal Database Project (Cole et al. 2008). Each 16S rRNA gene
sequence may be classified according to its closest phylogenetic affiliation using the RDP
classifier (Wang et al. 2007), allowing the calculation of the relative abundance of bacterial