Bioinformatics for LC-MS/MS-Based Proteomics - LC-MS/MS in Proteomics: Methods and Applications


engines that use the AMT tag approach to identify peptides (23, 24) (Table 4.3).

4.4. Spectral Matching

The searching of unknown MS spectra against databases of spectra from known compounds has long been used in all forms of analytical spectroscopy, including mass spectrometry. Until recently, there were no large databases of identified and annotated peptides available. As the size of the publicly available annotated databases has increased, so has the feasibility of spectral searching. Spectral matching software evaluates the overlap between observed and predicted fragments in the spectra and scores them using probability functions or spectral similarity metrics (Table 4.3). Spectral matching is considerably faster than uninterpreted MS/MS searching but normally relies on the peptides of interest being present in the annotated database. Peptides that come from low-abundance proteins, belong to rarely seen alternative splice forms, or fragment weakly under MS/MS conditions will be poorly represented. An alternative approach is to populate a database with simulated spectra of all the theoretical peptides for a proteome (25).
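As an illustration of the spectral similarity metrics mentioned above, the following minimal sketch scores a query spectrum against a small library using a binned cosine similarity (normalized dot product). It is not the scoring function of any particular tool in Table 4.3. The function names, the 1.0 m/z bin width, and the peak lists are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is a list of (m/z, intensity) pairs; bin_width is an assumed
    tolerance, not a recommended setting.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned

def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Return the cosine of the angle between two binned spectra (0.0 to 1.0)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_library_match(query, library, bin_width=1.0):
    """Return (score, name) for the library entry most similar to the query."""
    scored = [
        (cosine_similarity(query, peaks, bin_width), name)
        for name, peaks in library.items()
    ]
    return max(scored)  # highest score wins; ties are broken by name

# Hypothetical annotated library and query spectrum (made-up peak lists).
LIBRARY = {
    "PEPTIDE_A": [(175.1, 30.0), (262.2, 100.0), (375.2, 60.0)],
    "PEPTIDE_B": [(147.1, 80.0), (276.2, 40.0), (389.3, 100.0)],
}
QUERY = [(175.12, 28.0), (262.15, 95.0), (375.25, 70.0), (500.3, 5.0)]

if __name__ == "__main__":
    score, name = best_library_match(QUERY, LIBRARY)
    print(f"best match: {name} (cosine similarity {score:.3f})")
```

The sketch also shows the limitation noted above: a query whose peptide is absent from the library can only receive a low score, because there is no reference spectrum for it to match.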

4.5. Machine-Learning Algorithms

Machine-learning algorithms, which improve their performance as they are trained on more data, can also be used to analyze data or improve results (Table 4.3). The algorithms are classified into different types depending on their design. The most commonly used machine-learning algorithms in proteomics are supervised learning algorithms, which learn from a training set of manually curated examples, and semi-supervised learning algorithms, which learn from both manually curated examples and uninterpreted examples. Gist, a support vector machine (a supervised learning algorithm), was used to discriminate between positive and negative SEQUEST identifications using the scores reported by SEQUEST and other calculated factors (26), while Percolator, a semi-supervised learning algorithm, was used to discriminate between correct and decoy spectrum identifications (27, 28).
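To make the idea of supervised learning on search-engine scores concrete, the following minimal sketch trains a classifier on labeled peptide identifications and then scores a new one. It is not the method of Gist or Percolator: both of those rely on support vector machines, whereas this sketch swaps in a plain logistic regression so that it needs only the standard library. The two features (an XCorr-like score and a deltaCn-like score), the training values, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_probability(x, weights, bias):
    """Estimated probability that an identification with features x is correct."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train_logistic(features, labels, learning_rate=0.1, epochs=2000):
    """Fit a logistic regression by stochastic gradient descent.

    features: list of feature tuples; labels: 1 (correct) or 0 (incorrect).
    learning_rate and epochs are illustrative defaults, not tuned values.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = predict_probability(x, weights, bias) - y
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical manually curated training set: (XCorr-like, deltaCn-like).
TRAIN_FEATURES = [
    (3.8, 0.35), (1.2, 0.05), (4.2, 0.40), (1.8, 0.08),
    (3.1, 0.28), (2.0, 0.03), (3.5, 0.31), (1.5, 0.10),
]
TRAIN_LABELS = [1, 0, 1, 0, 1, 0, 1, 0]

if __name__ == "__main__":
    weights, bias = train_logistic(TRAIN_FEATURES, TRAIN_LABELS)
    for candidate in [(4.0, 0.38), (1.4, 0.06)]:
        p = predict_probability(candidate, weights, bias)
        print(candidate, f"P(correct) = {p:.2f}")
```

A semi-supervised variant would additionally assign provisional labels to uninterpreted identifications and retrain, with decoy matches serving as known negative examples; that loop is omitted here for brevity.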

5. Visualization of LC-MS/MS Data, Validating, Comparing, and Reporting Results

The software applications described in this section are all multifunctional but can be grouped around the general tasks of visualizing, validating, comparing, and reporting results (Table 4.4). The applications interact with the raw LC-MS/MS data and the peptide search results generated by the specialized algorithms described in Section 4.
