$$J_{\mathrm{obj}} = \sum_{x_i \in X} D(x_i, \mu_{y_i}) + \sum_{c_{ij} \in C} v(i, j) - \log P(\Theta) + \log Z + n \log Z_{\Theta} \qquad (7.20)$$
Basu et al. (6) used Rayleigh priors for $P(\Theta)$, and they ignored the normalizer terms. An optimal clustering is obtained by minimizing $J_{\mathrm{obj}}$ over the hidden variables $Y$ and the parameters $\Theta$, which comprise the cluster centroids $M$ and the distance measure parameters $A$ (note that given the cluster assignments $Y$, the means $M = \{\mu_i\}_{i=1}^{K}$ are uniquely determined).
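As a concrete illustration, the objective in Eq. (7.20) can be sketched in code. This minimal sketch drops the prior term $-\log P(\Theta)$ and the normalizer terms (the normalizers are ignored by Basu et al. as well), and it assumes squared Euclidean distance for $D$ and a constant penalty `w` for each violated constraint; the function name and the per-constraint weight are illustrative choices, not taken from the source:

```python
import numpy as np

def hmrf_objective(X, labels, centroids, must_link, cannot_link, w=1.0):
    """Sketch of J_obj from Eq. (7.20), with the prior and normalizer
    terms dropped. D is assumed to be squared Euclidean distance, and
    v(i, j) is assumed to be a constant weight w per violated constraint."""
    # Distortion term: sum over x_i of D(x_i, mu_{y_i})
    dist = sum(np.sum((X[i] - centroids[labels[i]]) ** 2) for i in range(len(X)))
    # Constraint-violation term: sum of v(i, j) over violated constraints
    penalty = 0.0
    for i, j in must_link:
        if labels[i] != labels[j]:
            penalty += w          # violated must-link constraint
    for i, j in cannot_link:
        if labels[i] == labels[j]:
            penalty += w          # violated cannot-link constraint
    return dist + penalty
```

A lower value indicates a clustering that is both tight around its centroids and consistent with the constraint set.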
7.6.2 EM Algorithm
As discussed in Section 7.6.1, Basu et al. (6) minimize $J_{\mathrm{obj}}$ using a K-Means-type iterative algorithm, HMRF-KMeans. The outline of the algorithm is presented in Figure 7.10. The basic idea of HMRF-KMeans is as follows: the constraints are used to obtain a good initialization of the clustering. Then, in the E-step, given the current cluster representatives, every data instance is re-assigned to the cluster that minimizes its contribution to $J_{\mathrm{obj}}$. The E-step of HMRF-KMeans uses an Iterated Conditional Modes (ICM) approach, a greedy strategy that sequentially updates the cluster assignment of each instance while keeping the assignments of all other instances fixed. In the M-step, the cluster representatives $M = (\mu_1, \ldots, \mu_K)$ are re-estimated from the cluster assignments to minimize $J_{\mathrm{obj}}$ for the current assignment. The clustering distance measure is subsequently updated in the M-step to reduce the objective function by modifying the parameters of the distance measure. Note that this corresponds to the generalized EM algorithm (38; 18), where the objective function is reduced but not necessarily minimized in the M-step. Effectively, the E-step minimizes $J_{\mathrm{obj}}$ over the cluster assignments $Y$, the M-step (A) minimizes $J_{\mathrm{obj}}$ over the cluster representatives $M$, and the M-step (B) reduces $J_{\mathrm{obj}}$ over the parameters of the distance measure. The E-step and the M-step are repeated until a specified convergence criterion is reached. Basu et al. (6) show that HMRF-KMeans converges to a local optimum of $J_{\mathrm{obj}}$.
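The alternating structure described above can be sketched as follows. This is a simplified illustration, not the algorithm of Figure 7.10: it assumes squared Euclidean distance and a constant penalty `w` per violated constraint, uses a naive nearest-centroid initialization instead of the constraint-based initialization of Basu et al., and omits the M-step (B) that learns the distance measure parameters:

```python
import numpy as np

def hmrf_kmeans(X, K, must_link, cannot_link, w=1.0, max_iter=50):
    """Simplified sketch of the HMRF-KMeans E/M loop: an ICM E-step over
    assignments and an M-step (A) re-estimating centroids. Distance-measure
    learning (M-step (B)) and constraint-based initialization are omitted."""
    # Naive initialization (the paper uses a constraint-based scheme instead)
    centroids = X[:K].astype(float).copy()
    labels = np.array([int(np.argmin(((x - centroids) ** 2).sum(axis=1)))
                       for x in X])

    def point_cost(i, k):
        # Contribution of x_i to J_obj if it were assigned to cluster k,
        # holding all other assignments fixed (the ICM local view)
        cost = np.sum((X[i] - centroids[k]) ** 2)
        for a, b in must_link:
            j = b if a == i else (a if b == i else None)
            if j is not None and labels[j] != k:
                cost += w   # would violate a must-link
        for a, b in cannot_link:
            j = b if a == i else (a if b == i else None)
            if j is not None and labels[j] == k:
                cost += w   # would violate a cannot-link
        return cost

    for _ in range(max_iter):
        changed = False
        # E-step (ICM): greedily reassign each instance in sequence
        for i in range(len(X)):
            best = min(range(K), key=lambda k: point_cost(i, k))
            if best != labels[i]:
                labels[i] = best
                changed = True
        # M-step (A): re-estimate centroids from the current assignments
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
        if not changed:
            break   # convergence: no assignment changed in the E-step
    return labels, centroids
```

Because each E-step and M-step can only decrease (or leave unchanged) the objective, the loop terminates at a local optimum, mirroring the generalized-EM argument given above.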
7.6.3 Improvements to HMRF-KMeans
There have been multiple improvements to the initial HMRF-based probabilistic generative constrained clustering framework. Lange et al. (34) incorporated prior knowledge from both labels on the input data instances and constraints into their clustering model. They inferred the constraint potentials in the HMRF model from a Maximum Entropy solution of $P(Y)$ under constraints encoded in the label and constraint set, and replaced the ICM-based greedy assignment scheme in the E-step of HMRF-KMeans with a mean-field approximation. Lu et al. (35) proposed probabilistic EM-style assignments instead of winner-take-all KMeans-type assignments, and used Gibbs sampling in the E-step of their constrained EM algorithm.
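To make the contrast with winner-take-all assignments concrete, a soft E-step assigns each instance a posterior distribution over clusters rather than a single label. The sketch below shows only this basic idea under assumed Gaussian-style responsibilities with an assumed inverse-temperature `beta`; it does not include the constraint potentials or the Gibbs sampling used by Lu et al.:

```python
import numpy as np

def soft_assignments(X, centroids, beta=1.0):
    """Probabilistic (soft) cluster assignments: each point x_i gets a
    posterior over clusters proportional to exp(-beta * D(x_i, mu_k)),
    with D taken to be squared Euclidean distance. Constraint terms are
    omitted; beta is an assumed inverse-temperature parameter."""
    # Squared Euclidean distances, shape (n, K)
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    logits = -beta * d
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)       # rows sum to 1
```

A winner-take-all E-step corresponds to the limit of large `beta`, where each row of the returned matrix concentrates on the nearest cluster.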