$$J_{\mathrm{obj}} = \sum_{x_i \in X} D(x_i, \mu_{y_i}) + \sum_{c_{ij} \in C} v(i, j) - \log P(\Theta) + \log Z + n \log Z_{\Theta} \qquad (7.20)$$
Basu et al. (6) used Rayleigh priors for $P(\Theta)$, and they ignored the normalizer terms. An optimal clustering is obtained by minimizing $J_{\mathrm{obj}}$ over the hidden variables $Y$ and the parameters $\Theta$, which comprise the cluster centroids $M$ and the distance measure parameters $A$ (note that given the cluster assignments $Y$, the means $M = \{\mu_i\}_{i=1}^{K}$ are uniquely determined).
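As a concrete illustration, the objective in Eq. (7.20) can be sketched in code. This minimal sketch drops the prior term $-\log P(\Theta)$ and the normalizer terms (the normalizers are ignored by Basu et al. as well), and it assumes squared Euclidean distance for $D$ and a constant penalty `w` for each violated constraint; the function name and the per-constraint weight are illustrative choices, not taken from the source:

```python
import numpy as np

def hmrf_objective(X, labels, centroids, must_link, cannot_link, w=1.0):
    """Sketch of J_obj from Eq. (7.20), with the prior and normalizer
    terms dropped. D is assumed to be squared Euclidean distance, and
    v(i, j) is assumed to be a constant weight w per violated constraint."""
    # Distortion term: sum over x_i of D(x_i, mu_{y_i})
    dist = sum(np.sum((X[i] - centroids[labels[i]]) ** 2) for i in range(len(X)))
    # Constraint-violation term: sum of v(i, j) over violated constraints
    penalty = 0.0
    for i, j in must_link:
        if labels[i] != labels[j]:
            penalty += w          # violated must-link constraint
    for i, j in cannot_link:
        if labels[i] == labels[j]:
            penalty += w          # violated cannot-link constraint
    return dist + penalty
```

A lower value indicates a clustering that is both tight around its centroids and consistent with the constraint set.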
7.6.2 EM Algorithm
As discussed in Section 7.6.1, Basu et al. (6) minimize $J_{\mathrm{obj}}$ using a K-Means-type iterative algorithm, HMRF-KMeans. The outline of the algorithm is presented in Figure 7.10. The basic idea of HMRF-KMeans is as follows: the constraints are used to obtain a good initialization of the clustering. Then, in the E-step, given the current cluster representatives, every data instance is re-assigned to the cluster that minimizes its contribution to $J_{\mathrm{obj}}$. The E-step of HMRF-KMeans uses an Iterated Conditional Modes (ICM) approach, a greedy strategy that sequentially updates the cluster assignment of each instance while keeping the assignments of all other instances fixed. In the M-step, the cluster representatives $M = (\mu_1, \ldots, \mu_K)$ are re-estimated from the cluster assignments to minimize $J_{\mathrm{obj}}$ for the current assignment. The clustering distance measure is subsequently updated in the M-step to reduce the objective function by modifying the parameters of the distance measure. Note that this corresponds to the generalized EM algorithm (38; 18), where the objective function is reduced but not necessarily minimized in the M-step. Effectively, the E-step minimizes $J_{\mathrm{obj}}$ over the cluster assignments $Y$, the M-step (A) minimizes $J_{\mathrm{obj}}$ over the cluster representatives $M$, and the M-step (B) reduces $J_{\mathrm{obj}}$ over the parameters of the distance measure. The E-step and the M-step are repeated until a specified convergence criterion is reached. Basu et al. (6) show that HMRF-KMeans converges to a local optimum of $J_{\mathrm{obj}}$.
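The alternating structure described above can be sketched as follows. This is a simplified illustration, not the algorithm of Figure 7.10: it assumes squared Euclidean distance and a constant penalty `w` per violated constraint, uses a naive nearest-centroid initialization instead of the constraint-based initialization of Basu et al., and omits the M-step (B) that learns the distance measure parameters:

```python
import numpy as np

def hmrf_kmeans(X, K, must_link, cannot_link, w=1.0, max_iter=50):
    """Simplified sketch of the HMRF-KMeans E/M loop: an ICM E-step over
    assignments and an M-step (A) re-estimating centroids. Distance-measure
    learning (M-step (B)) and constraint-based initialization are omitted."""
    # Naive initialization (the paper uses a constraint-based scheme instead)
    centroids = X[:K].astype(float).copy()
    labels = np.array([int(np.argmin(((x - centroids) ** 2).sum(axis=1)))
                       for x in X])

    def point_cost(i, k):
        # Contribution of x_i to J_obj if it were assigned to cluster k,
        # holding all other assignments fixed (the ICM local view)
        cost = np.sum((X[i] - centroids[k]) ** 2)
        for a, b in must_link:
            j = b if a == i else (a if b == i else None)
            if j is not None and labels[j] != k:
                cost += w   # would violate a must-link
        for a, b in cannot_link:
            j = b if a == i else (a if b == i else None)
            if j is not None and labels[j] == k:
                cost += w   # would violate a cannot-link
        return cost

    for _ in range(max_iter):
        changed = False
        # E-step (ICM): greedily reassign each instance in sequence
        for i in range(len(X)):
            best = min(range(K), key=lambda k: point_cost(i, k))
            if best != labels[i]:
                labels[i] = best
                changed = True
        # M-step (A): re-estimate centroids from the current assignments
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
        if not changed:
            break   # convergence: no assignment changed in the E-step
    return labels, centroids
```

Because each E-step and M-step can only decrease (or leave unchanged) the objective, the loop terminates at a local optimum, mirroring the generalized-EM argument given above.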
7.6.3 Improvements to HMRF-KMeans
There have been multiple improvements to the initial HMRF-based probabilistic generative constrained clustering framework. Lange et al. (34) incorporated prior knowledge from both labels on the input data instances and constraints into their clustering model. They inferred the constraint potentials in the HMRF model from a Maximum Entropy solution of $P(Y)$ under constraints encoded in the label and constraint set, and replaced the ICM-based greedy assignment scheme in the E-step of HMRF-KMeans with a mean-field approximation. Lu et al. (35) proposed probabilistic EM-style assignments instead of winner-take-all KMeans-type assignments, and used Gibbs sampling in the E-step of their constrained EM algorithm.
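To make the contrast with winner-take-all assignments concrete, a soft E-step assigns each instance a posterior distribution over clusters rather than a single label. The sketch below shows only this basic idea under assumed Gaussian-style responsibilities with an assumed inverse-temperature `beta`; it does not include the constraint potentials or the Gibbs sampling used by Lu et al.:

```python
import numpy as np

def soft_assignments(X, centroids, beta=1.0):
    """Probabilistic (soft) cluster assignments: each point x_i gets a
    posterior over clusters proportional to exp(-beta * D(x_i, mu_k)),
    with D taken to be squared Euclidean distance. Constraint terms are
    omitted; beta is an assumed inverse-temperature parameter."""
    # Squared Euclidean distances, shape (n, K)
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    logits = -beta * d
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)       # rows sum to 1
```

A winner-take-all E-step corresponds to the limit of large `beta`, where each row of the returned matrix concentrates on the nearest cluster.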