networks is simply its visual appeal and usefulness: complex protein
networks can be difficult to browse and understand, so clustering can
help to make out the highly connected “cliques” in such networks more
easily (Fig. 1).
For the clustering procedure itself, the choice of algorithm depends
not only on the precise question asked, but also on the type of network
at hand. Is it a densely connected or a sparse network? Are the edges
“weighted”, i.e. do they have a strength or confidence value attached to
them? Are the edges directed or undirected, and are there various “types”
of edges? Apart from standard clustering algorithms, such as
K-means or single-linkage hierarchical clustering, a number of specialized
algorithms have been designed (or at least adapted) specifically for cluster
analysis of protein networks. These include Markov clustering (MCL), 26
superparamagnetic clustering (SPC), 27 restricted neighborhood search
clustering (RNSC), 28 molecular complex detection (MCODE), 29 and others. When
deciding which clustering method to use and, equally important, which
parameter settings to use it with, it is essential to evaluate the results in
detail and to compare them to a set of trusted functional units that serve
as a reference. For any given dataset, the confidence in the results is highest
when the results tend to show little dependency on parameter settings,
and when they generally give a good overlap with previous expectations
and with the reference data. 30,31
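To make the role of a parameter setting concrete, the following is a minimal sketch of the expansion/inflation loop behind Markov clustering, written for small dense networks only. It is not the reference MCL implementation: the function names, the self-loop weight of 1.0, the `keep` threshold, and the final step of reading clusters off as connected components of the converged matrix are all simplifying assumptions made for this illustration. The `inflation` parameter is the setting one would vary when checking how strongly the clusters depend on it.

```python
def normalize_columns(matrix):
    """Rescale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl(weights, inflation=2.0, max_iter=100, tol=1e-9, keep=1e-3):
    """Toy Markov clustering of a symmetric, non-negative weight matrix."""
    n = len(weights)
    # Assumption: add a self-loop of weight 1.0 to every node before normalizing.
    m = normalize_columns([[float(weights[i][j]) + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(max_iter):
        # Expansion: square the matrix, i.e. take two-step random walks.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: raise each entry to a power and renormalize the columns,
        # which strengthens strong flows and weakens weak ones.
        inflated = normalize_columns([[x ** inflation for x in row]
                                      for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break

    # Simplified interpretation (assumption): nodes i and j share a cluster
    # if any remaining flow above `keep` connects them.
    labels = list(range(n))

    def find(x):
        while labels[x] != x:
            labels[x] = labels[labels[x]]
            x = labels[x]
        return x

    for i in range(n):
        for j in range(n):
            if m[i][j] > keep:
                labels[find(i)] = find(j)
    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


# Hypothetical test network: two triangles joined by one low-confidence edge.
bridge = 0.2
weights = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, bridge, 0, 0],
    [0, 0, bridge, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
for inflation in (1.5, 2.0, 3.0):
    print(inflation, mcl(weights, inflation=inflation))
```

Running the loop over several inflation values, as in the last lines, is one simple way to see whether the clusters are stable across parameter settings. Real protein networks are far too large for this dense-matrix version and would call for a sparse implementation.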
Based on artificial test networks, to which controlled amounts of
noise have been added, the four clustering algorithms mentioned above
have been tested and compared to each other, and were subsequently
also assessed on actual protein interaction data from high-throughput
experiments. This has suggested that MCL and RNSC tend to perform
better, 24 but tests like these of course depend on the exact type of input
data and should be repeated before each application to a new data type.
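The kind of benchmark described above can be sketched as follows: build an artificial network from known complexes, rewire a controlled fraction of its edges, cluster the result, and score the clusters against the known complexes. Everything here is an assumption made for illustration, not the protocol of the cited study. In particular, the noise model (remove a fraction of true edges and add the same number of false ones), the mean best-match Jaccard score, and the use of plain connected components as a deliberately naive stand-in for a real clustering algorithm are hypothetical choices.

```python
import random
from itertools import combinations


def planted_network(complexes):
    """Noise-free test network: every pair of proteins within a complex is linked."""
    return {frozenset(pair)
            for members in complexes
            for pair in combinations(members, 2)}


def add_noise(edges, nodes, fraction, rng):
    """Remove `fraction` of the true edges and add as many random false edges."""
    edges = set(edges)
    k = round(fraction * len(edges))
    removed = set(rng.sample(sorted(edges, key=sorted), k))
    candidates = [frozenset(p) for p in combinations(nodes, 2)
                  if frozenset(p) not in edges]
    added = set(rng.sample(candidates, min(k, len(candidates))))
    return (edges - removed) | added


def connected_components(edges, nodes):
    """Naive stand-in clustering: each connected component is one cluster."""
    neighbours = {v: set() for v in nodes}
    for edge in edges:
        a, b = tuple(edge)
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(neighbours[v] - component)
        seen |= component
        clusters.append(component)
    return clusters


def mean_best_jaccard(predicted, reference):
    """Average, over reference complexes, of the best Jaccard overlap with any cluster."""
    def jaccard(a, b):
        return len(a & b) / len(a | b)
    return sum(max(jaccard(set(ref), set(p)) for p in predicted)
               for ref in reference) / len(reference)


# Hypothetical reference set: three complexes of four proteins each.
reference = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
nodes = list(range(12))
rng = random.Random(0)  # fixed seed so that the run is repeatable
for fraction in (0.0, 0.1, 0.3):
    noisy = add_noise(planted_network(reference), nodes, fraction, rng)
    score = mean_best_jaccard(connected_components(noisy, nodes), reference)
    print(f"noise={fraction:.1f}  score={score:.2f}")
```

Because a single false edge is enough to merge two components, the stand-in clustering degrades quickly with noise. Swapping in a different clustering function while keeping the noise model and score fixed is the comparison the benchmark is meant to support.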
Fig. 1 (Continued) cut-off (solid lines), the two known complexes III and IV are
easily recovered. With the high-stringency cut-off (dotted lines), these are further
subdivided into functional units. COX1, COX2, and COX3, for example, form a
subcomplex together; they constitute the active reaction center of complex IV. Notice
how experimental interaction links are seen only within the complexes, whereas
functional connections also extend between the two complexes.