gordonii Challis NCTC7868 (2007, TIGR-CMR [48]), F. nucleatum ATCC 25586
(2002, TIGR-CMR [49]), bovine (2005, UC Santa Cruz), nrdb human subset (NCBI,
as provided with Thermo Bioworks ver. 3.3) and the MGC (Mammalian Gene Collection,
2004 curation, NIH-NCI [50]) concatenated with the reversed sequences. After
data processing, the genome sequence for strain 33277 became available [31] and the
data were subsequently cross-referenced to PGN numbers from the 33277-specific
FASTA database provided by Los Alamos National Laboratory (LANL) (personal
communication with G. Xie). Although Naito et al. [31] reported extensive genome
re-arrangements between W83 and ATCC 33277, the actual protein amino acid se-
quences are sufficiently similar across the proteome that the use of a database based
on W83 was not expected to greatly impact the analysis. Our proteomic methods are
not sensitive to genome re-arrangements, only to changes in amino acid sequence
for a given protein. The reversed sequences were used to calculate a
qualitative peptide-level false discovery rate (FDR) using the published method [51,
52]. The SEQUEST peptide-level search results were filtered and grouped by protein
using DTASelect [53], then input into a FileMaker script developed in-house [32, 33]
for further processing. The DTASelect version 1.9 filter parameters were as follows: peptides
were fully tryptic; minimum ΔCn/Xcorr values for the different peptide charge states were 0.08/1.9
for +1, 0.08/2.0 for +2, and 0.08/3.3 for +3; all spectra detected for each sequence
were retained (t = 0). Only peptides that were unique to a given ORF were used in the
calculations, ignoring tryptic fragments that were common to more than one ORF or
more than one organism, or both. In practice this reduced our
sampling depth relative to what we have achieved with single-organism studies [27, 32,
33], because the gene sequence overlap among the three organisms is significant. A
bioinformatic analysis (data not shown) of inferred protein sequence overlaps between
P. gingivalis and either S. gordonii or F. nucleatum suggested that the reduction in the number
of predicted tryptic fragments unique to P. gingivalis would not be sufficient to affect
the analysis of more than a small number of proteins. The qualitative peptide-level
FDR was controlled to approximately 5% for all conditions by selecting a minimum
non-redundant spectral count cut-off number appropriate to the complexity of each
condition, P. gingivalis alone or the P. gingivalis-F. nucleatum-S. gordonii commu-
nity.
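The filtering and FDR steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the in-house FileMaker script or the exact published method [51, 52]: the record fields (`charge`, `xcorr`, `delta_cn`, `fully_tryptic`, `orfs`, `peptide`, `is_decoy`) and the function names are hypothetical, the FDR is estimated with the common convention for concatenated forward/reversed searches (twice the decoy matches divided by all retained matches), and the "non-redundant spectral count" is interpreted as the number of distinct peptides per ORF.

```python
# Charge-specific (minimum DeltaCn, minimum Xcorr) values quoted in the text.
THRESHOLDS = {1: (0.08, 1.9), 2: (0.08, 2.0), 3: (0.08, 3.3)}


def passes_filter(psm):
    """True if a peptide-spectrum match (PSM) meets the quoted criteria."""
    limits = THRESHOLDS.get(psm["charge"])
    if limits is None or not psm["fully_tryptic"]:
        return False
    # Keep only peptides that map to exactly one ORF (unique peptides).
    if len(psm["orfs"]) != 1:
        return False
    min_delta_cn, min_xcorr = limits
    return psm["delta_cn"] >= min_delta_cn and psm["xcorr"] >= min_xcorr


def decoy_fdr(psms, min_count):
    """Estimate FDR for PSMs from ORFs with >= min_count distinct peptides.

    Assumed estimator: 2 * decoys / (targets + decoys), capped at 1.0.
    Returns 0.0 when no PSM survives the cut-off.
    """
    peptides_per_orf = {}
    for psm in psms:
        peptides_per_orf.setdefault(psm["orfs"][0], set()).add(psm["peptide"])
    kept = [p for p in psms if len(peptides_per_orf[p["orfs"][0]]) >= min_count]
    if not kept:
        return 0.0
    decoys = sum(1 for p in kept if p["is_decoy"])
    return min(1.0, 2.0 * decoys / len(kept))


def choose_cutoff(psms, target_fdr=0.05, max_cutoff=20):
    """Smallest cut-off whose estimated FDR is at or below target_fdr."""
    filtered = [p for p in psms if passes_filter(p)]
    for cutoff in range(1, max_cutoff + 1):
        if decoy_fdr(filtered, cutoff) <= target_fdr:
            return cutoff
    return None  # no cut-off up to max_cutoff reached the target
```

Running `choose_cutoff` separately on each condition would give a condition-specific cut-off, which mirrors the idea that a more complex sample (the three-organism community) may need a stricter cut-off than P. gingivalis alone to reach the same approximate 5% FDR.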
Protein Abundance Ratio Calculations
Protein relative abundances were estimated on the basis of summed intensity or spec-
tral count values [27, 32, 33] for proteins meeting the requirements for qualitative
identification described above. Summed intensity refers to the summation of all processed
parent ion (peptide) intensity measurements (MS1) for which a confirming CID
spectrum (MS2) was acquired according to the DTASelect filter files. For spectral
counts, the redundant numbers of peptides uniquely associated with each ORF were
taken from the DTASelect filter table (t = 0). Spectral counting is a frequency measurement
that has been demonstrated in the literature to correlate with protein abundance
[54]. These two ways of estimating protein relative abundance, which avoid the need
for stable isotope labeling, have been discussed in a recent review [27] with specific
reference to microbial systems. To calculate protein abundance ratios, a normalization
 