Biomedical Engineering Reference
engines that use the AMT tag approach to identify peptides (23, 24)
(Table 4.3).
4.4. Spectral Matching
The searching of unknown MS spectra against databases of spectra
from known compounds has been used for a long time in
all forms of analytical spectroscopy including mass spectrometry.
Until recently there were no large databases of identified and
annotated peptides available. As the size of the publicly available
annotated databases has increased, so has the feasibility of
spectral searching. Spectral matching software evaluates the overlap
between observed and predicted fragments in the spectra
and scores them using probability functions or spectral similarity
metrics (Table 4.3). Spectral matching is considerably faster
than uninterpreted MS/MS searching but is normally reliant on
the peptides of interest being present in the annotated database.
Peptides that come from low-abundance proteins or rarely seen
alternative splice forms, or that fragment weakly under MS/MS
conditions, will be poorly represented. An alternative approach is to
populate a database with simulated spectra of all the theoretical
peptides for a proteome (25).
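As an illustration of the spectral similarity metrics mentioned in this section, the following is a minimal sketch of one common choice, the cosine (normalized dot product) score between two binned spectra. It is not the scoring function of any particular tool listed in Table 4.3. The function names, the 1.0 m/z bin width, the library entry names, and all peak values are assumptions made up for the example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is a list of (m/z, intensity) pairs; bin_width is an assumed value.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine of the angle between two binned spectra (0 = no overlap, 1 = identical shape)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_library_match(query, library, bin_width=1.0):
    """Return (name, score) of the best-scoring library spectrum, or None if the library is empty."""
    scored = ((name, cosine_similarity(query, peaks, bin_width))
              for name, peaks in library.items())
    return max(scored, key=lambda item: item[1], default=None)


# Hypothetical query spectrum and two-entry library (invented values).
query = [(175.1, 100.0), (262.2, 50.0), (375.3, 80.0)]
library = {
    "PEPTIDE_A": [(175.1, 90.0), (262.1, 55.0), (375.2, 70.0)],
    "PEPTIDE_B": [(147.1, 100.0), (304.2, 60.0)],
}

name, score = best_library_match(query, library)
print(name, round(score, 3))
```

Binning trades mass accuracy for speed and tolerance to small m/z shifts. A query whose peptide is absent from the library still receives a best match, so in practice a score threshold or a statistical model is needed on top of the raw similarity.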
4.5. Machine-Learning Algorithms
Machine-learning algorithms, which improve their performance
over time, can also be used to analyze data or improve results
(Table 4.3). The algorithms are classified into different types
depending on their design. The most commonly used machine-learning
algorithms in proteomics are supervised learning algorithms,
which learn from a training set of manually curated
examples, and semi-supervised learning algorithms, which learn
from both manually curated examples and uninterpreted examples.
A support vector machine, Gist (a supervised learning algorithm),
was used to discriminate between positive and negative
identifications by SEQUEST using the scoring systems reported
by SEQUEST and other calculated factors (26), while a
semi-supervised learning algorithm, Percolator, was used to
discriminate between correct and decoy spectrum identifications (27, 28).
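To make the idea of supervised learning from curated examples concrete, the sketch below trains a simple perceptron (a basic linear classifier) to separate positive from negative identifications. It is deliberately not a support vector machine and does not reproduce Gist or Percolator. The two features stand in for search-engine scores such as XCorr and DeltaCn, and every feature value, label, and function name is invented for illustration.

```python
def train_perceptron(examples, labels, max_epochs=1000):
    """Learn weights and a bias that separate examples labelled +1 from those labelled -1.

    Stops early once a full pass over the training set makes no mistakes.
    """
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(examples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:
                # Misclassified: move the decision boundary toward this example.
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                errors += 1
        if errors == 0:
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 (accepted identification) or -1 (rejected identification)."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else -1


# Hypothetical curated training set: (XCorr-like score, DeltaCn-like score).
training_features = [
    (3.5, 0.30), (4.1, 0.25), (2.9, 0.35),  # curated as correct
    (1.2, 0.05), (1.8, 0.08), (1.5, 0.02),  # curated as incorrect
]
training_labels = [1, 1, 1, -1, -1, -1]

weights, bias = train_perceptron(training_features, training_labels)
predictions = [predict(weights, bias, x) for x in training_features]
print(predictions)
```

A semi-supervised method differs in that it would also draw on identifications with no curated label, for example by iteratively relabelling them with the current classifier; that step is omitted here.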
5. Visualization of LC-MS/MS Data, Validating, Comparing, and Reporting Results
The software applications described in this section are all
multifunctional but can be grouped around the general tasks of
visualization, validation, comparison, and reporting of results
(Table 4.4). The applications interact with the raw LC-MS/MS data and
the peptide search results generated by the specialized algorithms
described in Section 4.