subcellular locations;
protein/protein interactions; and
expression (tissue specificity and developmental stage).
To populate Swiss-Prot entries with this type of information, we:
use published experimental reports. This accounts for by far the bulk
of the manual functional annotation process. It requires the full
scientific expertise of the annotators to ensure the quality of the
textual representation of the role and function of the protein;
use and actively request information provided by the scientists who
have carried out functional characterization studies;
use prediction tools to infer information from the full set of
topological, domain, and PTM information; and
use sequence comparison tools to infer information from orthologs
and, when relevant, paralogs.
The UniProtKB keywords are useful in providing a very high-level
summary of some of the information displayed in a UniProtKB entry. We
maintain a keyword controlled vocabulary and ensure that keywords are
consistently used in the relevant entries. This is done partly
programmatically, but it also requires annotator judgment.
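The programmatic part of such a consistency check can be pictured with a minimal Python sketch. It is not the UniProtKB pipeline: the function name `check_keywords`, the three-term vocabulary, and the sample keywords are all assumptions made for illustration. The sketch only separates keywords that match the controlled vocabulary (ignoring case and stray whitespace) from those that do not, leaving the latter for an annotator to review.

```python
def check_keywords(entry_keywords, vocabulary):
    """Split an entry's keywords into recognized and unrecognized terms.

    Matching ignores case and surrounding whitespace; recognized terms
    are returned in their canonical vocabulary spelling.
    """
    vocab_lookup = {term.lower(): term for term in vocabulary}
    recognized, unrecognized = [], []
    for keyword in entry_keywords:
        canonical = vocab_lookup.get(keyword.strip().lower())
        if canonical is None:
            unrecognized.append(keyword)  # left for annotator judgment
        else:
            recognized.append(canonical)
    return recognized, unrecognized


# Illustrative vocabulary and entry keywords (assumed, not real data).
vocabulary = {"Transmembrane", "Signal", "Coiled coil"}
recognized, unrecognized = check_keywords(
    ["signal", "Coiled coil", "Kinase-like"], vocabulary
)
```

Anything that lands in `unrecognized` is exactly the kind of case the automated step cannot settle on its own.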
Unfortunately, there is still a large set of proteins that do not have
any homologs and that lack InterPro domain predictions. For these
proteins, inferring a function is a challenging, if not impossible, task.
These so-called “ORFan” proteins are semiautomatically annotated using
Anabelle, our sequence analysis workbench. The result of this process is
generally restricted to the annotation of predicted topological
information (signal sequences, transmembrane domains) as well as the
annotation of regions of compositional bias (coiled-coil domains, runs of
particular amino acids).
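As an illustration of the simplest kind of compositional bias mentioned above, a run of one amino acid, the following Python sketch scans a sequence for stretches of a single repeated residue. It is not how Anabelle works; the function name `find_residue_runs`, the default threshold of five residues, and the example sequence are assumptions chosen only for the example.

```python
import re


def find_residue_runs(sequence, min_length=5):
    """Return (residue, start, end) for each run of one repeated residue.

    Positions are 1-based and inclusive. min_length is an arbitrary
    illustrative threshold and must be at least 1.
    """
    # One captured residue followed by at least (min_length - 1) copies of it.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_length - 1))
    return [
        (match.group(1), match.start() + 1, match.end())
        for match in pattern.finditer(sequence.upper())
    ]


# Made-up sequence containing a run of 7 Q and a run of 6 N.
seq = "MKT" + "Q" * 7 + "LLSAA" + "N" * 6 + "GHK"
runs = find_residue_runs(seq)
# runs == [("Q", 4, 10), ("N", 16, 21)]
```

Shorter repeats such as the `LL` and `AA` pairs fall below the threshold and are not reported.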
2.3. Why Are dictyBase and UniProtKB Complementary?
dictyBase, as with other MODs, is genome-centric. In addition to
the complete genome sequence, groups performing high-throughput