Information on PTMs can be obtained by
• using results of published low- or high-throughput proteomics studies;
• using specific high-quality prediction tools (e.g. for signal sequences, transit peptides, N-glycosylation, etc.); or
• propagating annotation from already annotated orthologous proteins. This process must be carried out with the utmost care, since it is important to avoid propagating species- or phylum-specific PTMs outside of their realm.
Figure 4 provides an example of an entry with a predicted PTM feature.
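The caution about propagating PTMs from orthologs can be illustrated with a minimal sketch. It is not the actual annotation pipeline. The `PtmFeature` class, its `scope` field (the broadest taxon in which a PTM is assumed to occur), the `propagate_ptms` function, and the alignment-derived `position_map` are all hypothetical names introduced for illustration, and the example data are made up. A feature is transferred only if the target organism's lineage includes the PTM's scope and the aligned residue is conserved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PtmFeature:
    position: int      # 1-based residue position in the protein
    residue: str       # amino acid expected at that position
    description: str   # e.g. "Phosphoserine"
    scope: str         # hypothetical field: broadest taxon where the PTM is known


def propagate_ptms(features, target_sequence, target_lineage, position_map):
    """Return the source features that may be transferred to the target.

    position_map maps source positions to target positions (assumed to come
    from a pairwise alignment of the two orthologs).
    """
    accepted = []
    for feature in features:
        # Do not carry a taxon-specific PTM outside of its taxon.
        if feature.scope not in target_lineage:
            continue
        # Skip positions that have no aligned counterpart in the target.
        target_pos = position_map.get(feature.position)
        if target_pos is None:
            continue
        # Require the modified residue to be conserved in the target.
        if target_sequence[target_pos - 1] != feature.residue:
            continue
        accepted.append(
            PtmFeature(target_pos, feature.residue, feature.description, feature.scope)
        )
    return accepted


# Made-up example: the second feature is restricted to a taxon that the
# target organism does not belong to, so it is not propagated.
source_features = [
    PtmFeature(3, "S", "Phosphoserine", "Eukaryota"),
    PtmFeature(5, "N", "Hypothetical taxon-restricted modification", "Metazoa"),
]
transferred = propagate_ptms(
    source_features,
    target_sequence="MASKNLT",
    target_lineage=("Eukaryota", "Fungi"),
    position_map={3: 3, 5: 5},
)
print([f.description for f in transferred])
```

In practice, a curator would still review each transferred feature, since residue conservation alone does not prove that the modification occurs in the target species.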
It is also of paramount importance to correctly and fully represent
the domain structure of proteins as well as to report relevant sites and
motifs.
• Annotation of topological domains (transmembrane, extracellular regions, etc.) is made on the basis of
  - published experimental topological data;
  - transmembrane prediction tools;
  - results of some PTM prediction tools that offer insight into the topology of the protein, such as GPI-anchor prediction, signal or transit peptide prediction, etc.; or
  - similarity to close orthologs, complemented by a high-level manual check to carefully estimate the specific biological context of some topological information.
• Annotation of specific validated domains and important sites (active sites, metal-binding sites, etc.) is made on the basis of
  - information derived from three-dimensional (3D) structures through software-assisted data mining of the relevant Protein Data Bank (PDB) [9] entries;
  - InterPro [10] and ProRule, [11] a domain annotation rule system that we developed to help in the annotation of important sequence