a known protein complex and a candidate protein member of that complex, thus predicting new members of partially defined protein complexes [133,134].

The topology of interactome networks can be exploited to predict novel protein interactions. Network motifs are patterns of interconnection involving more than two nodes [135,136]. Triangles and larger densely connected subgraphs are more frequent in most protein–protein interactome networks than would be expected by chance, in turn rendering candidate protein interactions that complete many new triangles in the network more likely to be true interactions [137,138]. Accordingly, identification of densely connected subgraphs in an interactome network can help identify protein complexes [108,139–142]. Other methods exploiting network topology parameters have proved useful in predicting physical interaction networks of other organisms [54].

A promising strategy for interaction prediction is to produce multicolor network motifs derived by integrating protein–protein interactome networks with genetic interaction networks or phenotypic profile similarity networks (see Chapters 5 and 6). Protein interactions tend to connect genes with related phenotypes, as was first discerned in small-scale studies of protein interactions, and later demonstrated for large-scale interactome maps and systematic phenotyping data [50,143–147]. It follows that genetic interaction profiles can be used to predict protein complexes [18,143,146,148–150]. On the basis of the strength of the physical and genetic interactions of a particular protein pair it is possible to assess the likelihood of that protein pair operating either within a protein complex or connecting two functionally related protein complexes [151]. The predictive power of these integrative approaches lies in the systems organization through which interactomes underlie genotype–phenotype relationships.

Predicting Gene Functions, Phenotypes and Disease Associations

Early in the implementation of Y2H, Raf kinase was imputed as an oncogene based on its specific interaction with H-Ras [152]. This type of reasoning is behind the principles of 'guilt-by-association' and 'guilt-by-profiling', whereby a functional annotation can be transferred from one gene/protein to another 'across' biological relationships, or 'across' profiles of biological relationships [153–157]. Function prediction methods based on these principles either make assumptions about the independence of evidence types, or model the interdependencies between edge types [158]. Protein A, of unknown function, can be said with some likelihood to be involved in the same biological process as protein B, of known function, if A and B belong to the same protein complex, or if their profiles of membership in modules identified from the interactome network are similar [141]. Protein interactions, mostly transferred by orthology from human, allow the most precise predictions of gene function in mouse genes among several data types [159].

Topological modules in interactome networks most likely correspond to specific biological processes or functions [160]. It follows that identifying modules containing genes/proteins of both known and unknown function can help assign function to uncharacterized genes/proteins. Combining topological modules with other genomic or proteomic data in multicolor networks brings about more biologically coherent units [128,161–165].

That protein interactions mediate protein functions, and that protein interactions tend to connect genes/proteins with related phenotypes just as they tend to connect genes/proteins with related functions, suggests that protein interactions can be used to predict new disease genes. Mutations causative for ataxia, a neurodegenerative disorder, affect proteins that share interacting partners. A subset of these shared partners have been found to be associated with neurodegeneration in animal models [59,166]. The ability of interactome maps to highlight new candidate disease genes and disease modifier genes had been anticipated in the early large-scale binary interactome maps [60,61]. With large-scale interactome maps available for human, various computational efforts systematically prioritize potential human disease genes based on the patterns of protein interactions [93,167–171]. Current efforts to predict human gene function and disease phenotype are now striving to combine several orthogonal large-scale genomic and proteomic data types [172,173].

Progress in this area will depend increasingly on efforts to establish benchmark data to allow rigorous comparisons among the evolving methods. Benchmarking has happened preliminarily for gene function prediction [159]. For prediction of phenotype or disease most current methods rely upon a handful of known 'training examples': small sets of genes known to be associated with the phenotype or disease. This strategy has its worth, but eventually methods that can predict disease genes in the absence of known examples will also be needed. Genome-wide association (GWA) studies provide an emerging example where predictions can be attempted without training examples. Although GWA studies serve to identify a genomic locus associated with a disease, they often cannot pinpoint which single gene of several or many resident within the locus is the actual disease gene. Where multiple GWA loci are linked to a single disorder or trait, the subset of genes that exhibit between-locus protein interactions, or other types of biological relationships, may be the most likely to be the causal disease genes [168,169,174].

It may be possible to predict phenotypes and prioritize disease genes based on local network topology alone.
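
As a rough illustration of the guilt-by-association principle discussed in this section, the sketch below scores candidate annotations for an uncharacterized protein by the fraction of its annotated interaction partners that carry each annotation. The protein names, interactions, and annotations are invented toy data, and plain neighbor voting is an assumed, deliberately simple scoring rule; it is not the method of any study cited here.

```python
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Turn undirected (a, b) interaction pairs into a neighbor lookup."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def neighbor_vote(protein, neighbors, annotations):
    """Return {function: fraction of annotated partners with that function}.

    Partners without any known annotation are ignored; an empty dict is
    returned when no annotated partner exists.
    """
    annotated = [n for n in neighbors.get(protein, set()) if n in annotations]
    if not annotated:
        return {}
    counts = Counter(f for n in annotated for f in annotations[n])
    return {f: c / len(annotated) for f, c in counts.items()}


# Hypothetical toy interactome: protein "A" has no known function.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
annotations = {
    "B": {"DNA repair"},
    "C": {"DNA repair"},
    "D": {"transport"},
    "E": {"transport"},
}

neighbors = build_adjacency(edges)
scores = neighbor_vote("A", neighbors, annotations)
for function, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{function}: {score:.2f}")
```

Because the score uses only direct partners, it treats every interaction as equally reliable evidence. Weighting edges by confidence, or comparing whole interaction profiles as in guilt-by-profiling, would be natural extensions.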
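
The GWA-based idea described in this section, favoring genes whose products interact with genes from other associated loci, can be sketched as a simple count. The locus names, gene symbols, and interactions below are hypothetical placeholders, and counting between-locus partners is an assumed scoring rule chosen for brevity rather than the approach of the cited studies. The sketch also assumes each gene belongs to exactly one locus.

```python
def between_locus_scores(loci, interactions):
    """Count, for each locus gene, its interaction partners in *other* loci.

    loci: dict mapping locus name -> set of genes within that locus
    interactions: iterable of undirected (gene_a, gene_b) pairs
    """
    # Assumption: every gene appears in at most one locus.
    locus_of = {gene: name for name, genes in loci.items() for gene in genes}
    scores = {gene: 0 for gene in locus_of}
    seen = set()
    for a, b in interactions:
        pair = frozenset((a, b))
        if a == b or pair in seen:
            continue  # skip self-interactions and duplicate pairs
        seen.add(pair)
        # Only interactions linking two different associated loci count.
        if a in locus_of and b in locus_of and locus_of[a] != locus_of[b]:
            scores[a] += 1
            scores[b] += 1
    return scores


def top_candidates(loci, scores):
    """Return the highest-scoring gene(s) per locus; ties are all kept."""
    best = {}
    for name, genes in loci.items():
        top = max(scores[g] for g in genes)
        best[name] = sorted(g for g in genes if scores[g] == top) if top > 0 else []
    return best


# Hypothetical loci associated with one trait, and a toy interaction list.
loci = {
    "locus1": {"G1", "G2", "G3"},
    "locus2": {"G4", "G5"},
    "locus3": {"G6", "G7"},
}
interactions = [
    ("G1", "G4"), ("G1", "G6"), ("G2", "G3"),
    ("G4", "G6"), ("G5", "X9"), ("G4", "G1"),
]

scores = between_locus_scores(loci, interactions)
candidates = top_candidates(loci, scores)
print(candidates)
```

No training examples are needed here: the only inputs are the loci themselves and an interaction map. A real analysis would also need to test whether the observed between-locus connectivity exceeds what is expected by chance.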