Biology Reference
In-Depth Information
using estimates $\{c_k, \Sigma_k\}$ for the cluster in question.
There are several ways to model the relationship between distances and probabilities [2, 8]. The following assumption is our basic principle.
Principle. For each $x \in \mathcal{D}$ and cluster $\mathcal{C}_k$, the probability $p_k(x)$ satisfies
\[
\frac{p_k(x)\, d_k(x)}{q_k} = \text{constant, say } D(x), \text{ depending on } x. \tag{2.5}
\]
Cluster membership is thus more probable the closer the data point is to the cluster
center and the bigger the cluster.
2.2.1. Probabilities
From the above principle, and the fact that probabilities add to 1, we get
Theorem 2.1. Let the cluster centers $\{c_1, c_2, \ldots, c_K\}$ be given, let $x$ be a data point, and let $\{d_k(x) : k = 1, \ldots, K\}$ be its distances from the given centers. Then the membership probabilities of $x$ are
\[
p_k(x) = \frac{\prod_{j \neq k} d_j(x)/q_j}{\sum_{i=1}^{K} \prod_{j \neq i} d_j(x)/q_j}, \quad k = 1, \ldots, K. \tag{2.6}
\]
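As an illustration only, formula (2.6) can be sketched in Python. The function name `membership_probabilities` and its argument names are hypothetical and do not come from the text; the sketch assumes all cluster sizes are positive and that at most one distance is zero, so that the denominator of (2.6) does not vanish.

```python
from math import prod


def membership_probabilities(distances, sizes):
    """Membership probabilities per (2.6) (illustrative sketch).

    distances: the d_k(x), distances of x from the K cluster centers
    sizes:     the q_k, cluster sizes (assumed positive)
    """
    if len(distances) != len(sizes):
        raise ValueError("distances and sizes must have the same length")
    # Ratios d_k(x) / q_k that appear throughout (2.5) and (2.6).
    ratios = [d / q for d, q in zip(distances, sizes)]
    # Numerator of (2.6): product of the ratios over all j != k.
    numerators = [
        prod(r for j, r in enumerate(ratios) if j != k)
        for k in range(len(ratios))
    ]
    # Denominator of (2.6): sum of those products over all clusters.
    total = sum(numerators)
    if total == 0:
        raise ValueError("formula (2.6) is undefined for this input")
    return [n / total for n in numerators]


# Two equal-sized clusters; x is three times farther from the second center.
print(membership_probabilities([1.0, 3.0], [1.0, 1.0]))  # → [0.75, 0.25]
```

The closer center receives the larger probability, and increasing a cluster's size `q_k` raises its probability in the same way that shrinking its distance would.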
Proof. Using (2.5) we write, for any $i, k$,
\[
p_i(x) = p_k(x)\, \frac{d_k(x)/q_k}{d_i(x)/q_i}.
\]
Since $\sum_{i=1}^{K} p_i(x) = 1$,
\[
p_k(x) \sum_{i=1}^{K} \frac{d_k(x)/q_k}{d_i(x)/q_i} = 1.
\]
Therefore
\[
p_k(x) = \frac{1}{\sum_{i=1}^{K} \dfrac{d_k(x)/q_k}{d_i(x)/q_i}} = \frac{\prod_{j \neq k} d_j(x)/q_j}{\sum_{i=1}^{K} \prod_{j \neq i} d_j(x)/q_j}.
\]
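A small numerical check may make the two forms of $p_k(x)$ concrete. The values below are chosen purely for illustration and are not taken from the text: $K = 3$, distances $(d_1, d_2, d_3) = (1, 2, 4)$ and sizes $(q_1, q_2, q_3) = (1, 1, 2)$, so that the ratios $d_k/q_k$ are $(1, 2, 2)$.
\[
\begin{aligned}
\prod_{j \neq 1} \frac{d_j}{q_j} &= 2 \cdot 2 = 4, \qquad
\prod_{j \neq 2} \frac{d_j}{q_j} = 1 \cdot 2 = 2, \qquad
\prod_{j \neq 3} \frac{d_j}{q_j} = 1 \cdot 2 = 2, \\
p_1 &= \frac{4}{4 + 2 + 2} = \frac{1}{2}, \qquad
p_2 = \frac{2}{8} = \frac{1}{4}, \qquad
p_3 = \frac{2}{8} = \frac{1}{4}, \\
p_1 &= \frac{1}{\frac{1}{1} + \frac{1}{2} + \frac{1}{2}} = \frac{1}{2} \quad \text{(reciprocal form, } k = 1\text{)}, \\
\frac{p_k d_k}{q_k} &= \frac{1}{2} \cdot 1 = \frac{1}{4} \cdot 2 = \frac{1}{4} \cdot 2 = \frac{1}{2} \quad \text{for } k = 1, 2, 3.
\end{aligned}
\]
The probabilities add to 1, both forms agree, and the common value $1/2$ plays the role of the constant $D(x)$ in (2.5).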
In particular, for $K = 2$,
\[
p_1(x) = \frac{d_2(x)/q_2}{d_1(x)/q_1 + d_2(x)/q_2}, \qquad
p_2(x) = \frac{d_1(x)/q_1}{d_1(x)/q_1 + d_2(x)/q_2}, \tag{2.7}
\]