7 Genome Annotation: Shotgun Proteomics Complements Shotgun Genomics
The number of newly sequenced genomes and new genome projects,
prokaryotic and eukaryotic as well as the “metagenomes” of
communities of organisms, is increasing exponentially.
The large amount of data generated poses an enormous
challenge for bioinformatics. A classical approach is the purely
computer-assisted annotation of predicted open reading frames
(ORFs). Newly developed techniques also consider experimental
data, such as proteomic and metabolomic data, for
improved genome annotation [20]. A high-throughput method
with a high protein identification rate is the “shotgun
proteomics” analysis described above. Following the principles
of shotgun genomics technology, proteins are reconstructed from
tryptic peptides stemming from a digest of whole proteomes.
Shotgun proteomics is characterized by a very high protein
identification rate and generates huge qualitative proteome
catalogues from
model organisms. In a recent publication we used shotgun
proteomics data to project all identified proteins onto a
functional genome annotation and a subsequent metabolic
reconstruction of the unicellular green alga Chlamydomonas
reinhardtii, a recently sequenced model organism for
photosynthesis and CO2-neutral biomass production that is also
called the “green yeast” [20]. In this way, predicted gene models
can be confirmed by proteomic data [20]. Furthermore, many
protein data that
are not predicted by existing gene models may point to new
gene models that cannot be detected by computer-based in silico
analyses alone. Recently, we performed a comprehensive shotgun
proteomics analysis of Chlamydomonas reinhardtii under various
growth conditions, integrated all the raw proteomics data, and
searched this dataset against different Chlamydomonas genome
annotation databases from the last 5 years [21]. The result is
that the choice among different genome annotation databases for
the same organism has a strong impact on the functional
interpretation of the proteomics data as well as on quantitative
proteomics [21]. It is therefore highly recommended that
databases be integrated or that six-frame translations be used
for peptide identification.
Furthermore, the differences between the database searches
point to critical genomic regions and single genes for gene
function analysis and annotation problems [21].
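
The six-frame translation and tryptic digest mentioned above can be
illustrated with a minimal sketch. It is not the pipeline used in
[20, 21]; it only shows the principle, assuming the standard genetic
code, an unambiguous cleavage rule (after K or R, but not before P),
and a hypothetical minimum peptide length. All function names and
the min_length parameter are illustrative assumptions.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate codon by codon; '*' marks stops, 'X' unknown codons."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate both strands in all three reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def tryptic_peptides(protein, min_length=6):
    """In silico tryptic digest of a translated frame.

    The frame is first split at stop codons, then each stretch is
    cleaved after K or R unless the next residue is P.
    min_length is an illustrative cutoff, not a value from the text.
    """
    peptides = []
    for stretch in protein.split("*"):
        start = 0
        for i, aa in enumerate(stretch):
            next_aa = stretch[i + 1] if i + 1 < len(stretch) else ""
            if aa in "KR" and next_aa != "P":
                peptides.append(stretch[start:i + 1])
                start = i + 1
        if start < len(stretch):
            peptides.append(stretch[start:])
    return [p for p in peptides if len(p) >= min_length]
```

Applied to a genomic sequence, six_frame_translations returns one
translated string per frame (keys "+1" to "-3"), and the peptides
from tryptic_peptides for all six frames together form a search
space that does not depend on any particular set of gene models.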
8 Conclusion
The presented proteomics and data integration platform is applicable
to many different research fields, such as proteomic investigations,
metabolic modeling, proteogenomics, and phosphoproteomics.
 