Environmental Engineering Reference
'marsh' OR 'fen') AND 'soil' AND '16S' on November 11, 2012. Non-16S rRNA sequences
from GenBank were removed by checking the sequence names. All 16S rRNA gene
sequences from the two databases were merged, and duplicate sequences, identified by
their Accession Numbers, were removed. Mallard was used to screen sequences for vector
nucleotides and chimeras (http://www.cf.ac.uk/biosi/research/biosoft/). The 16S rRNA gene
sequences of Escherichia coli (accession number: U00096) and Methanothermobacter
thermoautotrophicus (accession number: AE000666) were selected as reference sequences
for bacteria and archaea, respectively. To avoid uncertainties in comparing and
classifying short sequences, which have little or no sequence overlap, sequences shorter
than 250 bp were removed from the dataset. The remaining sequences comprised the redacted
composite dataset used in this work.
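The merging, de-duplication, and length-filtering steps described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the function name `build_composite` and the `(accession, sequence)` record layout are hypothetical, and only the 250 bp cut-off and the use of Accession Numbers to detect duplicates come from the text.

```python
MIN_LENGTH_BP = 250  # length cut-off stated in the text


def build_composite(records, min_length=MIN_LENGTH_BP):
    """Drop duplicate accessions and short sequences from merged records.

    `records` is an iterable of (accession, sequence) pairs pooled from
    both databases; this layout is assumed for illustration only.
    """
    seen = set()
    kept = []
    for accession, sequence in records:
        if accession in seen:
            continue  # duplicate Accession Number: keep the first copy only
        seen.add(accession)
        if len(sequence) < min_length:
            continue  # shorter than the cut-off: too little overlap to compare
        kept.append((accession, sequence))
    return kept
```

Screening for chimeras and vector nucleotides, and removing non-16S entries, would still need to be done separately with dedicated tools, as the text describes.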
2.2. Phylogenetic Analysis
Sequences were aligned with Kalign [36] and classified into taxonomic ranks using the
RDP Classifier with default settings [37]. Based on the output classifications from the RDP
Classifier, treemaps were constructed using the treemap package in R. The dataset was
divided into the following groups based on the classifications: Archaea, Bacteria,
Proteobacteria, Actinobacteria, Firmicutes, Acidobacteria, Bacteroidetes, Chloroflexi, and the
collected "minor phyla" of bacteria, which comprised sequences not assigned to any of the
phyla mentioned above. Distance matrices of aligned sequences were computed within ARB using
the Jukes-Cantor correction [38]. Individual distance matrices were analyzed using Mothur [39]
to cluster OTUs, generate rarefaction curves, and estimate the expected maximum species
richness as a complement to the ACE and Chao1 richness estimates. Unless otherwise stated, a
genetic distance of ≤0.03 was used to define species-level OTUs. The distance cut-offs for other
taxonomic ranks were set as follows: 0.05, genus; 0.10, family; 0.15, class/order; 0.2, phylum. All
estimated asymptotes of the rarefaction curves were determined using the R package monomol
(https://github.com/binma/monomol) [40]. The coverage percentages were calculated as
described by Nelson et al. [41].
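As a rough illustration of the quantities named above, the sketch below computes a Jukes-Cantor corrected distance between two aligned sequences, looks up the finest rank whose cut-off a distance satisfies, and evaluates a Chao1 richness estimate from OTU abundances. It is not the ARB or Mothur implementation. The function names are hypothetical, skipping gapped columns is an assumption, and the bias-corrected form of Chao1 is used here without knowing which form the authors' run reported. Only the cut-off values are taken from the text.

```python
import math

# Distance cut-offs from the text, ordered from finest to coarsest rank.
RANK_CUTOFFS = [
    (0.03, "species"),
    (0.05, "genus"),
    (0.10, "family"),
    (0.15, "class/order"),
    (0.20, "phylum"),
]


def jukes_cantor(seq_a, seq_b, gap="-"):
    """Jukes-Cantor distance d = -3/4 * ln(1 - 4p/3) for aligned sequences.

    p is the fraction of mismatched columns; columns with a gap in either
    sequence are skipped (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no overlapping positions to compare")
    p = mismatches / compared
    if p >= 0.75:
        return math.inf  # correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def finest_shared_rank(distance):
    """Return the finest rank whose cut-off the distance satisfies, or None."""
    for cutoff, rank in RANK_CUTOFFS:
        if distance <= cutoff:
            return rank
    return None


def chao1(otu_counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for c in otu_counts if c > 0)
    f1 = sum(1 for c in otu_counts if c == 1)  # singleton OTUs
    f2 = sum(1 for c in otu_counts if c == 2)  # doubleton OTUs
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
```

The rank lookup only applies the thresholds to a single pairwise distance; actual OTU assignment depends on the clustering algorithm applied to the full distance matrix.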
2.3. Accession Numbers
The Accession Numbers for all sequences analyzed in this study are available from the
corresponding author. The sequences are currently maintained in an in-house ARB database
of 16S rRNA gene sequences for wetlands. A copy of this database and the sequence
alignment are also available on request from the corresponding author.
3. Results and Discussion
This meta-analysis was based on publicly available 16S rRNA gene
sequences recovered from wetland soils worldwide. The sequence dataset collected from