Using these definitions, Pandey et al. 57 proposed the following graph trans-
formation approach for purifying available interaction datasets. Consider the
input interaction network G = (V, E), where V is the set of nodes representing
the proteins in the network, and E is the set of edges representing the
protein-protein interactions constituting the network. First, the h-confidence
measure is computed between each pair of constituent proteins, whether con-
nected or unconnected by an edge in the input network. Next, a threshold
is applied to drop the protein pairs with a low h-confidence to remove spurious
interactions and control the density of the network. The resultant graph
G' = (V, E') is hypothesized to be a less noisy and more complete version of
G, since it is expected to contain fewer noisy edges, some biologically viable
edges that were not present in the original graph, and more accurate weights
on the remaining edges.
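The transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes an unweighted input network and takes the h-confidence of a protein pair to be the number of neighbors the two proteins share, divided by the larger of their two degrees. That definition belongs to the earlier part of the chapter and is restated here as an assumption. The function name `h_confidence_transform` and its parameters are hypothetical.

```python
from itertools import combinations


def h_confidence_transform(nodes, edges, threshold):
    """Build the edge set E' of a transformed graph G' = (V, E').

    Assumed definition (unweighted network): for proteins u and v,
    h-confidence = |N(u) & N(v)| / max(|N(u)|, |N(v)|),
    where N(x) is the set of interaction partners of x.
    Returns a dict mapping each retained pair (u, v) to its weight.
    """
    # Adjacency sets built from the original edge list E.
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    new_edges = {}
    # Score every pair, whether or not it is connected in the input network.
    for u, v in combinations(sorted(nodes), 2):
        max_degree = max(len(neighbors[u]), len(neighbors[v]))
        if max_degree == 0:
            continue  # neither protein has any partner, so no evidence
        h = len(neighbors[u] & neighbors[v]) / max_degree
        # Keep only the pairs whose score reaches the threshold.
        if h >= threshold:
            new_edges[(u, v)] = h
    return new_edges


# Toy network: C and D share both partners but are not linked in the input.
toy_nodes = {"A", "B", "C", "D"}
toy_edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("A", "B")]
transformed = h_confidence_transform(toy_nodes, toy_edges, threshold=0.5)
```

In this toy network the pair (C, D) enters the transformed graph although it was absent from the input, while edges such as (A, C), whose endpoints share few partners, are dropped. Scoring all pairs is quadratic in the number of proteins, so a large network would need a more selective enumeration.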
In order to evaluate the efficacy of the resultant networks for protein function
prediction, we provided the original and the transformed graphs as input
to the FunctionalFlow algorithm. 59 FunctionalFlow is a graph-theory-based
algorithm that enables insufficiently connected proteins to obtain functional
annotations from distant proteins in the network and has produced much
better results than several other function prediction algorithms operating on
protein interaction networks. We also tested several transformed versions of
the input network generated using our graph transformation approach in con-
junction with some other common neighbor-based similarity measures, such
as the number of common neighbors, and Samanta et al.'s p-value measure. 55
Figure 8.4 shows the performance of the FunctionalFlow algorithm on these
transformed versions of two standard interaction networks, measured in terms
of the accuracy of the top-scoring 1,000 predictions of the functions of the
constituent proteins.
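The evaluation measure can be illustrated with a short Python sketch. The text does not spell out how the accuracy is computed, so this assumes it is the fraction of the k highest-scoring (protein, function) predictions that agree with known annotations. The function name `top_k_accuracy` and the data layout are hypothetical.

```python
def top_k_accuracy(scored_predictions, known_annotations, k=1000):
    """Fraction of the k highest-scoring predictions that are correct.

    scored_predictions: iterable of (protein, function, score) tuples.
    known_annotations: dict mapping a protein to its set of known functions.
    Assumption: a prediction is correct when the predicted function
    appears among the known annotations of that protein.
    """
    # Rank all predictions by score, highest first, and keep the top k.
    ranked = sorted(scored_predictions, key=lambda p: p[2], reverse=True)[:k]
    if not ranked:
        return 0.0
    correct = sum(
        1
        for protein, function, _score in ranked
        if function in known_annotations.get(protein, set())
    )
    return correct / len(ranked)


# Toy example with three predictions and k = 2.
predictions = [("P1", "F1", 0.9), ("P2", "F2", 0.8), ("P3", "F1", 0.1)]
annotations = {"P1": {"F1"}, "P2": {"F3"}, "P3": {"F1"}}
accuracy_at_2 = top_k_accuracy(predictions, annotations, k=2)
```

Applying the same function to predictions made on the original network and on each transformed network gives directly comparable numbers, since k is held fixed.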
The h-confidence-based transformations of both standard interaction networks
yield a significant improvement in the accuracy of the predictions. One network
(combined) is constructed by combining several popular yeast interaction
datasets and weighted using the EPR index tool; the other (DIPCore) is a
high-confidence subset of the DIP database. 35 This improvement shows that the
association analysis-based graph transformation approach is indeed able to
reduce noise, enhance completeness, and assign more reliable weights to the
constituent edges. The other similarity measures were also substantially
outperformed by h-confidence. This result is consistent with that of an earlier
study, where h-confidence and hypercliques were used to eliminate noisy objects
from datasets. 60
8.4.2 Future Directions
The above discussion shows that the preprocessing of biological data can en-
hance the performance of standard function prediction algorithms substan-
tially, and thus should be considered as an integral step of the process of