Probabilistic Distance Clustering: Algorithm and Applications - Clustering Challenges in Biological Network - page 34


Fig. 2.2. Results returned by the PDQ algorithm (Algorithm 1 below) for the data of Example 2.1: (a) level sets of the JDF; (b) level sets of cluster probabilities.

Let $x$ be a given data point with distances $d_1(x), d_2(x)$ to the cluster centers, and assume the cluster sizes $q_1, q_2$ are known. Then the probabilities in (2.7) are the optimal solutions of the extremal problem

$$
\begin{aligned}
\min \quad & \frac{d_1(x)\, p_1^2}{q_1} + \frac{d_2(x)\, p_2^2}{q_2} \\
\text{s.t.} \quad & p_1 + p_2 = 1 \\
& p_1, p_2 \ge 0
\end{aligned}
\tag{2.12}
$$

Indeed, the Lagrangian of this problem is

$$
L(p_1, p_2, \lambda) = \frac{d_1(x)\, p_1^2}{q_1} + \frac{d_2(x)\, p_2^2}{q_2} + \lambda\,(1 - p_1 - p_2) \tag{2.13}
$$

and zeroing the partials $\partial L/\partial p_i$ gives the principle (2.5).
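The step from the Lagrangian to the principle can be written out as follows. This is a sketch under two assumptions: the objective is quadratic in the probabilities (terms $d_i(x)\, p_i^2 / q_i$), and the multiplier term is $\lambda\,(1 - p_1 - p_2)$. Neither (2.5) nor (2.7) is reproduced on this page, so identifying the first line with the principle and the second with the probabilities is inferred from context.

```latex
\begin{align*}
\frac{\partial L}{\partial p_i} = \frac{2\, d_i(x)\, p_i}{q_i} - \lambda = 0
  &\;\Longrightarrow\;
  \frac{p_1\, d_1(x)}{q_1} = \frac{p_2\, d_2(x)}{q_2} = \frac{\lambda}{2}, \\
p_1 + p_2 = 1
  &\;\Longrightarrow\;
  p_1 = \frac{d_2(x)/q_2}{d_1(x)/q_1 + d_2(x)/q_2}, \quad
  p_2 = \frac{d_1(x)/q_1}{d_1(x)/q_1 + d_2(x)/q_2}.
\end{align*}
```

Both probabilities are nonnegative whenever the distances and cluster sizes are positive, so the constraint $p_1, p_2 \ge 0$ is satisfied automatically.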

Substituting the probabilities (2.7) in (2.13) we get the optimal value of (2.12),

$$
L^{*}(p_1(x), p_2(x)) = \frac{d_1(x)\, d_2(x) / (q_1 q_2)}{d_1(x)/q_1 + d_2(x)/q_2} \tag{2.14}
$$

which is again the JDF (2.10).
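The identity between the optimal value and the JDF can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names are hypothetical, the distances and cluster sizes are made-up values (not the data of Example 2.1), the objective is assumed to be quadratic in the probabilities, and the probabilities are taken in the form implied by $p_1 d_1/q_1 = p_2 d_2/q_2$ with $p_1 + p_2 = 1$.

```python
from fractions import Fraction


def probabilities(d1, d2, q1, q2):
    """Membership probabilities implied by p1*d1/q1 = p2*d2/q2 and p1 + p2 = 1."""
    a, b = d1 / q1, d2 / q2
    return b / (a + b), a / (a + b)


def objective(p1, p2, d1, d2, q1, q2):
    """Objective of (2.12), assumed quadratic in the probabilities."""
    return d1 * p1**2 / q1 + d2 * p2**2 / q2


def jdf(d1, d2, q1, q2):
    """Right-hand side of (2.14)."""
    return (d1 * d2 / (q1 * q2)) / (d1 / q1 + d2 / q2)


# Illustrative values only (not the data of Example 2.1).
d1, d2, q1, q2 = Fraction(1), Fraction(3), Fraction(2), Fraction(1)

p1, p2 = probabilities(d1, d2, q1, q2)
optimal_value = objective(p1, p2, d1, d2, q1, q2)
print(p1, p2, optimal_value, jdf(d1, d2, q1, q2))  # 6/7 1/7 3/7 3/7

# No feasible point on a grid of p1 values does better than the closed form.
grid = [Fraction(i, 1000) for i in range(1001)]
grid_min = min(objective(p, 1 - p, d1, d2, q1, q2) for p in grid)
print(grid_min >= optimal_value)  # True
```

Using `Fraction` rather than floats makes the equality between the objective value and the JDF exact, so no tolerance is needed in the comparison.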

The corresponding extremal problem for the data set $D = \{x_1, x_2, \ldots, x_N\}$
