Graph-based Approaches for Motif Discovery - Clustering Challenges in Biological Network

Biology Reference

In-Depth Information

finding problem. Non-integral solutions, on the other hand, are not feasible for the

ILP and do not translate to a selection of positions for the motif finding problem.

Those instances need to be solved by other means, such as using an ILP solver.

Zaslavsky and Singh [47], who proposed this ILP formulation, couple the linear

programming heuristic with a suite of graph pruning techniques, which enable a

drastic reduction in graph sizes, and lead to effective and efficient application of

the LP/ILP solvers to much smaller graphs.

4.3.1.1. Graph Pruning

The authors [47] introduce number of successively more powerful optimality-

preserving dead-end elimination (DEE) techniques for pruning graphs corre-

sponding to motif finding problems. The basic idea is to discard vertices and/or

edges that cannot possibly be part of the optimal solution.

Here we mention the simplest of the techniques. Suppose there exists an N -

clique of weight C ∗ in G .Thenavertex u , whose participation in any possible

N -clique in G reduces the weight of that clique below C ∗ , is incompatible with

the optimal alignment and can be safely eliminated.

More formally, for vertex u

V i define star ( u ) to be a selection of vertices

from every graph part other than V i .Let F u be the value induced by the edge

weights for the set of vertices in the star ( u ) that form best pairwise alignments

with u :

∈

F u =

j = i

max

v∈V j

w uv

(4.1)

If u were to participate in any N -clique in G , F u is the maximum it can contribute

to the weight of the clique.

Similarly, let F i be the value of the best possible

star ( u ) among all u

∈

V i :

F i =max

u∈V i

F u

(4.2)

F i is an upper bound on what any vertex in V i can contribute to any alignment.

If F z ,themostavertex z

V k can contribute to a clique, assuming the best

possible contributions from all other graph parts, is insufficient compared to the

value C ∗ of an existing clique, i.e. if

∈

C ∗ −

F i ,

F z < 2

×

(4.3)

i = k

z can be discarded. The clique value C ∗ is used with a factor of 2 since two edges

are accounted for between every pair of graph parts in the above inequality.

Clustering Challenges in Biological Network

Search WWH ::

Custom Search

Home