Proof. The gradient of $d(x, c) = \langle (x - c),\, Q(x - c) \rangle^{1/2}$ with respect to $c$ is
$$
\nabla_c\, \langle (x - c),\, Q(x - c) \rangle^{1/2}
= -\,\frac{Q(x - c)}{\langle (x - c),\, Q(x - c) \rangle^{1/2}}
= -\,\frac{Q(x - c)}{d(x, c)}, \qquad (2.27)
$$
assuming $x \neq c$. Therefore, if $c_1, c_2$ do not coincide with any of the data points $x_i$, we have
$$
\nabla_{c_k} f(c_1, c_2)
= -\sum_{i=1}^{N} \frac{p_k(x_i)^2}{q_k}\, Q_k\, \frac{x_i - c_k}{d_k(x_i, c_k)}. \qquad (2.28)
$$
Setting the gradient equal to zero, “cancelling” the matrix Q k and the common
factor q k , and summing like terms, we get
$$
\sum_{i=1}^{N} \frac{p_k(x_i)^2}{d_k(x_i, c_k)}\, x_i
= \left( \sum_{i=1}^{N} \frac{p_k(x_i)^2}{d_k(x_i, c_k)} \right) c_k,
$$
proving (2.24) and (2.26). Substituting (2.7) in (2.26) then gives (2.25).
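The key step in the proof, the gradient formula (2.27), can be verified numerically. The sketch below compares the closed-form gradient against central finite differences; the specific matrix and points are made up for illustration.

```python
import numpy as np

# Check (2.27): the gradient of d(x, c) = <x - c, Q(x - c)>^(1/2)
# with respect to c is -Q(x - c) / d(x, c).
# Q, x, c below are arbitrary illustrative values, not from the text.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Q = A @ A.T + 3 * np.eye(3)      # positive definite weight matrix
x = rng.standard_normal(3)
c = rng.standard_normal(3)

def d(c):
    v = x - c
    return np.sqrt(v @ Q @ v)

analytic = -Q @ (x - c) / d(c)   # right-hand side of (2.27)

eps = 1e-6                       # central finite differences in each coordinate
numeric = np.array([(d(c + eps * e) - d(c - eps * e)) / (2 * eps)
                    for e in np.eye(3)])

assert np.allclose(analytic, numeric, atol=1e-5)
```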
Note: The theorem holds also if a center coincides with a data point, if we interpret the resulting indeterminate ratio as 1 in (2.24).
Theorem 2.2 applies, in particular, to the Mahalanobis distance (2.4),
$$
d(x, c_k) = \left( (x - c_k)^T\, \Sigma_k^{-1}\, (x - c_k) \right)^{1/2},
$$
where $\Sigma_k$ is the (given or computed) covariance matrix of the cluster $C_k$.
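As a small numerical sketch of this case, with the weight matrix $Q$ of the theorem taken to be $\Sigma_k^{-1}$ (the cluster data, center, and point below are invented for illustration):

```python
import numpy as np

# Mahalanobis distance of a point x from a center c_k, weighted by the
# inverse of the cluster covariance matrix Sigma_k.
rng = np.random.default_rng(1)
cluster = rng.standard_normal((50, 2)) @ np.array([[2.0, 0.3], [0.3, 0.5]])
c_k = cluster.mean(axis=0)                 # computed center
Sigma_k = np.cov(cluster, rowvar=False)    # computed covariance matrix

def mahalanobis(x, c, Sigma):
    v = x - c
    # <v, Sigma^{-1} v>^{1/2}, via a linear solve instead of an explicit inverse
    return np.sqrt(v @ np.linalg.solve(Sigma, v))

x = np.array([1.0, -0.5])
print(mahalanobis(x, c_k, Sigma_k))
```

Solving the linear system rather than forming $\Sigma_k^{-1}$ explicitly is the usual numerically safer choice.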
For the general case of K clusters it is convenient to use the probabilistic form
(2.26).
Corollary 2.1. Consider a function of K centers
$$
f(c_1, c_2, \ldots, c_K) = \sum_{k=1}^{K} \sum_{i=1}^{N} \frac{d_k(x_i, c_k)\, p_k(x_i)^2}{q_k}, \qquad (2.29)
$$
an analog of (2.21). Then, under the hypotheses of Theorem 2.2, the minimizers
of f are
$$
c_k = \sum_{i=1}^{N} \frac{u_k(x_i)}{\sum_{t=1}^{N} u_k(x_t)}\, x_i,
\quad \text{with } u_k(x_i) = \frac{p_k(x_i)^2}{d_k(x_i, c_k)}, \qquad (2.30)
$$
for $k = 1, \ldots, K$.
Proof.
Same as the proof of Theorem 2.2.
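Since $u_k(x_i)$ in (2.30) itself depends on $c_k$, the formula is a fixed-point characterization rather than an explicit solution, and it suggests an iteration. The sketch below assumes a Euclidean $d_k$ and fixed illustrative probabilities $p_k(x_i)$ (stand-ins for whatever (2.7) supplies); data and names are invented.

```python
import numpy as np

# One center update per (2.30): c_k becomes the weighted mean of the data,
# with weights u_k(x_i) = p_k(x_i)^2 / d_k(x_i, c_k).
def update_center(X, c_k, p_k):
    d = np.linalg.norm(X - c_k, axis=1)   # d_k(x_i, c_k), assumed nonzero
    u = p_k**2 / d                        # u_k(x_i)
    return (u[:, None] * X).sum(axis=0) / u.sum()

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 2)) + np.array([3.0, 1.0])  # one synthetic cluster
c = np.zeros(2)                            # arbitrary starting center
p = np.full(100, 0.9)                      # illustrative constant probabilities
for _ in range(20):                        # iterate (2.30) to a fixed point
    c = update_center(X, c, p)
print(c)
```

With constant $p_k(x_i)$ the weights reduce to $1/d_k(x_i, c_k)$, and the iteration is the classical Weiszfeld-type update, converging to a point near the cluster's center.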