\[
h(x_1, x_2) = \max_i \, p\bigl(y(x_1) = i \mid \Theta\bigr) \cdot \max_j \, p\bigl(y(x_2) = j \mid \Theta\bigr)
\tag{7.14}
\]
where y(x_i) is the label assignment for point x_i.
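Eq. (7.14) can be evaluated directly from the per-point cluster posteriors produced by the GMM. A minimal sketch, assuming the posterior vectors for the two points are given (the example values below are illustrative, not from the text):

```python
import numpy as np

def weak_distance(posterior_x1, posterior_x2):
    """Weak 'same-cluster' score of Eq. (7.14): the product of the
    maximal posterior cluster probabilities of the two points."""
    return np.max(posterior_x1) * np.max(posterior_x2)

# Hypothetical GMM posteriors over K = 3 clusters for two points.
p1 = np.array([0.8, 0.1, 0.1])   # x1 is confidently in cluster 0
p2 = np.array([0.1, 0.7, 0.2])   # x2 is confidently in cluster 1
h = weak_distance(p1, p2)        # 0.8 * 0.7 = 0.56
```

Note that the score is high when both points are confidently assigned, regardless of whether they are assigned to the same cluster.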
The DistBoost algorithm computes the weights of the “weak” distance functions using boosting, and updates the weights on pairs of instances, which are then translated to weights on individual data instances. These instance weights are passed back to the input of the GMM-EM algorithm, and the process is repeated for multiple steps.
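The translation from pair weights to instance weights can be sketched as follows. This is a hypothetical simplification of DistBoost's translation step, assuming each point simply accumulates the weight of every constrained pair it participates in (the pair indices and weights below are illustrative):

```python
import numpy as np

def pair_to_instance_weights(pair_weights, n_points):
    """Translate boosting weights on pairs (i, j) into a normalized
    weight distribution over individual instances, by giving each
    point the summed weight of the pairs it belongs to."""
    w = np.zeros(n_points)
    for (i, j), wij in pair_weights.items():
        w[i] += wij
        w[j] += wij
    total = w.sum()
    # Fall back to a uniform distribution if no pair is weighted.
    return w / total if total > 0 else np.full(n_points, 1.0 / n_points)

# Point 1 appears in both pairs, so it receives the largest weight.
weights = pair_to_instance_weights({(0, 1): 0.5, (1, 2): 0.25}, n_points=4)
```

The resulting distribution then serves as the per-instance weighting for the next GMM-EM round.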
7.6 Satisfying Constraints and Learning Distance Functions
As mentioned in Section 7.2, there have been some algorithms that try to
both enforce constraints and learn distance functions from constraints for par-
titional clustering algorithms, e.g., KMeans. In this section we will outline one
such algorithm, which uses the framework of a generative probabilistic model,
the Hidden Markov Random Field (HMRF) (6). It can cluster text docu-
ments using either cosine distance or Euclidean distance on L 2 -normalized
input data, doing both constraint satisfaction and distance learning in the
process.
7.6.1 Hidden Markov Random Field (HMRF) Model
The Hidden Markov Random Field (HMRF) is a probabilistic generative
model for semi-supervised constrained clustering, consisting of the following
components: (1) an observable set X = (x_1, ..., x_n) of random variables, corresponding to the given data instances X; (2) an unobservable (hidden) set Y = (y_1, ..., y_n) of random variables, corresponding to cluster assignments of instances in X, with y_i ∈ {1, ..., K}; (3) an unobservable (hidden) set of generative model parameters Θ, which consists of distance measure parameters A (typically a matrix or vector of weights) and cluster representatives M = (μ_1, ..., μ_K), so that Θ = {A, M}; (4) an observable set of constraint variables C = (c_12, c_13, ..., c_{n-1,n}). Each c_ij is a ternary variable taking a value from the set {-1, 0, 1}, where c_ij = 1 indicates that (x_i, x_j) ∈ C_ML, c_ij = -1 indicates that (x_i, x_j) ∈ C_CL, and c_ij = 0 corresponds to pairs (x_i, x_j) that
are not constrained. The constraints are accompanied by associated violation costs W, where w_ij represents the cost of violating the constraint between instances x_i and x_j, if such a constraint exists. Fig. 7.8 shows a simple example
of an HMRF having five data instances partitioned into three clusters, while
maximally respecting three pairwise constraints.
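The constraint variables C and costs W can be encoded concretely as symmetric matrices, with c[i, j] ∈ {-1, 0, 1} as defined above. A minimal sketch, assuming n = 5 instances as in the figure; the specific pairs, costs, and helper names below are illustrative, not taken from the text:

```python
import numpy as np

n = 5
c = np.zeros((n, n), dtype=int)   # c[i, j]: 1 must-link, -1 cannot-link, 0 none
w = np.zeros((n, n))              # w[i, j]: cost of violating that constraint

def add_constraint(i, j, kind, cost):
    """kind = +1 for a must-link (C_ML), -1 for a cannot-link (C_CL)."""
    c[i, j] = c[j, i] = kind
    w[i, j] = w[j, i] = cost

add_constraint(0, 1, +1, cost=2.0)   # must-link
add_constraint(2, 3, +1, cost=1.0)   # must-link
add_constraint(0, 4, -1, cost=1.5)   # cannot-link

def violation_cost(labels):
    """Total cost of constraints violated by a cluster assignment: a
    must-link pair split across clusters, or a cannot-link pair placed
    in the same cluster, pays its cost w[i, j]."""
    cost = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if c[i, j] == 1 and labels[i] != labels[j]:
                cost += w[i, j]
            elif c[i, j] == -1 and labels[i] == labels[j]:
                cost += w[i, j]
    return cost
```

An assignment respecting all three constraints (e.g., labels [0, 0, 1, 1, 2]) incurs zero violation cost, and this cost term is what the HMRF objective trades off against the distance-based clustering cost.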
The joint probability of X, Y, and Θ, given C, in the described HMRF