From this ontology we selected all the GO terms that annotate between 20 and
3,000 human proteins. Note that we discarded all the proteins that were
associated with the GO terms by electronic annotation, because we do not want
to rely on GO annotations that have been predicted on the basis of features
that are part of our input vectors (for example, InterPro domains). In the end,
we obtained 1,930 GO terms, which serve as output targets for the supervised
models to be trained. Since a protein can have more than one function, an input
vector can be associated with more than one GO term.
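The selection step described above can be sketched as follows. This is only a minimal illustration, not the authors' pipeline: the function names, the `(protein_id, go_term, evidence_code)` tuple layout, and the choice to count only direct annotations are assumptions. The evidence code `IEA` is the standard GO code for "Inferred from Electronic Annotation".

```python
from collections import defaultdict


def select_go_terms(annotations, min_proteins=20, max_proteins=3000,
                    excluded_evidence=("IEA",)):
    """Return {go_term: set of protein ids} for terms within the size bounds.

    `annotations` is an iterable of (protein_id, go_term, evidence_code)
    tuples (a hypothetical layout chosen for this sketch). Annotations
    whose evidence code is excluded are ignored before counting.
    """
    proteins_by_term = defaultdict(set)
    for protein_id, go_term, evidence in annotations:
        if evidence in excluded_evidence:
            continue  # skip electronically inferred annotations
        proteins_by_term[go_term].add(protein_id)
    return {
        term: proteins
        for term, proteins in proteins_by_term.items()
        if min_proteins <= len(proteins) <= max_proteins
    }


def labels_by_protein(selected_terms):
    """Invert the selection into multi-label targets: protein -> GO terms."""
    labels = defaultdict(set)
    for term, proteins in selected_terms.items():
        for protein_id in proteins:
            labels[protein_id].add(term)
    return dict(labels)
```

Because each protein maps to a set of terms, the same input vector can serve as a positive example for several targets, which matches the multi-function case mentioned above.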
Table 1. Input features

Protein Feature                          Internet Location
InterPro domains                         www.ebi.ac.uk/interpro/
PROSITE patterns/profiles                expasy.org/prosite/
Coiled coil domains                      www.ch.embnet.org/software/COILS form.html
Transmembrane domains                    In-house program following [5]
Transmembrane domains                    saier-144-21.ucsd/memsat.html
Transmembrane domains                    www.cbs.dtu.dk/services/TMHMM/
GPI lipid anchors                        mendel.imp.ac.at/gpi/gpi server.html
N-Glycosylation sites                    www.cbs.dtu.dk/services/NetNGlyc/
O-glycosylation sites                    www.cbs.dtu.dk/services/NetOGlyc/
N-terminal myristoylation                mendel.imp.ac.at/myristate/SUPLpredictor.htm
N-terminal myristoylation                expasy.org/tools/myristoylator/
Presence of poly-aminoacids              In-house program
Sequence repeats                         andrade/papers/rep/search.html
Signal peptide cleavage sites            www.cbs.dtu.dk/services/SignalP/
Tyrosine sulfation sites                 expasy.org/tools/sulfinator/
Subcellular location and cleavage site   www.cbs.dtu.dk/services/TargetP/
Disordered regions                       anchor.enzim.hu/
Interaction with other human proteins    string-db.org/
                                         www.embl.de/

4 Experiments
In the first series of experiments, we considered 1,930 distinct classification
problems, in which the positive examples are proteins annotated with a
particular GO term, while the negative examples are selected from outside the
branch of that GO term. This classification strategy corresponds to the
“one-versus-all” approach. We performed both cross-validation trials and tests
on the independent test set.
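One way to build the positive and negative sets for a single GO term is sketched below. The text does not define "branch" precisely, so this sketch assumes it means the term together with all of its ancestors and descendants in the ontology graph. The `parents` mapping, the function names, and the decision to leave proteins annotated only elsewhere in the branch out of both sets are assumptions made for illustration.

```python
from collections import defaultdict


def _reachable(start, edges):
    """All nodes reachable from `start` by following `edges` (start excluded)."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def branch_of(term, parents):
    """Assumed branch: the term, its ancestors, and its descendants.

    `parents` maps each GO term to the set of its direct parent terms.
    """
    children = defaultdict(set)
    for child, term_parents in parents.items():
        for parent in term_parents:
            children[parent].add(child)
    return {term} | _reachable(term, parents) | _reachable(term, children)


def one_versus_all_split(term, labels_by_protein, parents):
    """Return (positives, negatives) for one binary classification problem.

    Positives carry the term itself; negatives carry no term from the
    branch. Proteins annotated only with other branch terms are skipped.
    """
    branch = branch_of(term, parents)
    positives, negatives = set(), set()
    for protein_id, terms in labels_by_protein.items():
        if term in terms:
            positives.add(protein_id)
        elif not (set(terms) & branch):
            negatives.add(protein_id)
    return positives, negatives
```

Repeating the split for every selected GO term yields one binary problem per term, which is the one-versus-all arrangement described above.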
 