simultaneously solve both problems is the combination of several search
strategies and identification tools in so-called identification
“workflows”, which improve MS/MS data analysis by increasing the number
of confidently identified peptides. 46,47
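As a minimal sketch of what such a workflow might do, the following Python example merges peptide-spectrum matches reported by several search engines and accepts a peptide for a spectrum when either one engine is highly confident or several engines agree. The `PSM` record, the normalized `score` in [0, 1], and the `min_score` and `min_engines` thresholds are illustrative assumptions and are not taken from any of the platforms discussed here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """A peptide-spectrum match reported by one search engine (hypothetical record)."""
    spectrum_id: str
    peptide: str
    engine: str
    score: float  # assumed to be normalized to [0, 1] across engines


def combine_identifications(psms, min_score=0.9, min_engines=2):
    """Return {spectrum_id: peptide} for spectra accepted by the combined workflow."""
    support = defaultdict(set)   # (spectrum, peptide) -> engines reporting it
    best = defaultdict(float)    # (spectrum, peptide) -> best score seen
    for p in psms:
        key = (p.spectrum_id, p.peptide)
        support[key].add(p.engine)
        best[key] = max(best[key], p.score)

    accepted = {}
    for (spectrum_id, peptide), engines in support.items():
        confident = best[(spectrum_id, peptide)] >= min_score
        agreed = len(engines) >= min_engines
        if not (confident or agreed):
            continue
        # Keep only the highest-scoring accepted candidate per spectrum.
        current = accepted.get(spectrum_id)
        if current is None or best[(spectrum_id, peptide)] > best[(spectrum_id, current)]:
            accepted[spectrum_id] = peptide
    return accepted


example_psms = [
    PSM("s1", "PEPTIDEK", "engineA", 0.95),    # one confident engine
    PSM("s2", "LVNELTEFAK", "engineA", 0.60),  # two engines agree
    PSM("s2", "LVNELTEFAK", "engineB", 0.55),
    PSM("s3", "AAAK", "engineB", 0.40),        # weak, unsupported
]
result = combine_identifications(example_psms)
```

In this toy input, spectrum `s2` is accepted only because two engines agree on it, which illustrates how combining tools can increase the number of confidently identified peptides compared with a single search.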
Up to now, very few platforms dedicated to proteomic data processing
have been implemented. These platforms aim to automate the
identification process in order to reduce data analysis time and to
improve both the quality of identification and the coverage of matched
spectra. The Trans-Proteomic Pipeline (TPP) 46 is an open source platform
comprising the suite of MS/MS analysis tools described in Sec. 2.2. This
pipeline can import output files from Sequest and comprises various
modules, mainly for postprocessing, including result validation,
quantification of isotopically labeled samples, and the Pep3D tool for
viewing raw LC/MS data and results at the peptide and protein levels.
MASPECTRAS 48 also supports several identification tools by importing
their result files, which are then used as input for further processing.
The TOPP package, 49 a framework for processing simple MS/MS data, should
also be mentioned, as its parts can be connected into a pipeline by
self-written scripts. Both TPP and MASPECTRAS support a fixed set of
identification programs implementing the same search strategy.
Several commercial platforms are also available. For example, Scaffold
analyzes Mascot and Sequest results and validates hits by cross-correlation
with X!Tandem, filters out uninteresting spectra, and exports high-quality
unidentified ones for future analysis. ProteinScape 50 and ProteinLynx
Global SERVER also adopt a stepwise approach to proteome study.
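The stepwise idea described above (validate hits against a second engine, discard poor spectra, and set aside good but unidentified ones) can be sketched as follows. This is only an illustration under stated assumptions and not the actual algorithm of Scaffold or any other commercial product: the function name `triage_spectra`, the per-spectrum `quality` score in [0, 1], and the `min_quality` threshold are all hypothetical.

```python
def triage_spectra(quality, primary_hits, secondary_hits, min_quality=0.8):
    """Split spectra into validated, unconfirmed, and exportable unidentified sets.

    quality:        {spectrum_id: quality score in [0, 1]} (hypothetical measure)
    primary_hits:   {spectrum_id: peptide} from the main search engine(s)
    secondary_hits: {spectrum_id: peptide} from the engine used for cross-checking
    """
    validated, unconfirmed, export = {}, {}, []
    for spectrum_id, q in quality.items():
        peptide = primary_hits.get(spectrum_id)
        if peptide is None:
            # No identification: keep only high-quality spectra for later analysis.
            if q >= min_quality:
                export.append(spectrum_id)
        elif secondary_hits.get(spectrum_id) == peptide:
            # The second engine reports the same peptide for this spectrum.
            validated[spectrum_id] = peptide
        else:
            unconfirmed[spectrum_id] = peptide
    return validated, unconfirmed, sorted(export)


quality = {"s1": 0.90, "s2": 0.95, "s3": 0.20, "s4": 0.70}
primary = {"s1": "PEPTIDEK", "s4": "AAAK"}
secondary = {"s1": "PEPTIDEK", "s4": "GGGK"}
validated, unconfirmed, export = triage_spectra(quality, primary, secondary)
```

Here `s1` is validated by agreement, `s4` remains unconfirmed, `s2` is exported as a high-quality unidentified spectrum, and `s3` is filtered out as uninteresting.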
The swissPIT 51 platform was designed by the PIG as a workflow-oriented
toolbox giving access to a number of MS/MS analysis software tools. The
first identification workflow was created using the JOpera engine. 52
Currently, four identification tools (Phenyx, Popitam, X!Tandem, and
InsPecT 44) and two protein datasets (UniProtKB/Swiss-Prot and
UniProtKB/TrEMBL) are being tested in the pipeline. These first four
algorithms were chosen for their popularity, for their known efficiency
in implementing various parameterized search strategies, and because
the authors have access to their source code. The computing resources
that currently serve the purposes of the