$\kappa_h \geq 0$, we obtain
$$\alpha_h = \frac{1}{n}\sum_{i=1}^{n} p(h \mid x_i, \Theta), \tag{6.8}$$
$$r_h = \sum_{i=1}^{n} x_i \, p(h \mid x_i, \Theta), \tag{6.9}$$
$$\mu_h = \frac{r_h}{\|r_h\|}, \tag{6.10}$$
$$\frac{I_{d/2}(\kappa_h)}{I_{d/2-1}(\kappa_h)} = \frac{\|r_h\|}{\sum_{i=1}^{n} p(h \mid x_i, \Theta)}. \tag{6.11}$$
Observe that (6.10) and (6.11) are intuitive generalizations of (6.4) and (6.5)
respectively, and they correspond to an M-step in an EM framework. Given
these parameter updates, we now look at schemes for updating the distributions of $Z \mid (X, \Theta)$ (i.e., an E-step) to maximize the likelihood of the data given the parameter estimates above.
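To make these updates concrete, here is a minimal NumPy sketch of the M-step in (6.8)–(6.11). All names are illustrative (`X` holds the $n$ unit-norm document vectors as rows, `post[i, h]` holds $p(h \mid x_i, \Theta)$), and since (6.11) cannot be solved for $\kappa_h$ in closed form, the sketch uses a standard approximation to the inverse of the Bessel-function ratio rather than an exact numerical inversion.

```python
import numpy as np

def m_step(X, post):
    """M-step of (6.8)-(6.11). X: n x d unit-norm rows; post: n x k posteriors."""
    n, d = X.shape
    weights = post.sum(axis=0)              # sum_i p(h | x_i, Theta), shape (k,)
    alpha = weights / n                     # (6.8): mixing proportions
    r = post.T @ X                          # (6.9): r_h = sum_i x_i p(h | x_i, Theta)
    r_norm = np.linalg.norm(r, axis=1)      # ||r_h||
    mu = r / r_norm[:, None]                # (6.10): mean directions
    r_bar = r_norm / weights                # right-hand side of (6.11)
    # (6.11) equates a ratio of Bessel functions to r_bar; a widely used
    # approximation to its inverse gives the concentration estimate:
    kappa = (r_bar * d - r_bar**3) / (1.0 - r_bar**2)
    return alpha, mu, kappa
```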
From the standard EM framework, the distribution of the hidden variables
(34; 11) is given by
$$p(h \mid x_i, \Theta) = \frac{\alpha_h f_h(x_i \mid \Theta)}{\sum_{l=1}^{k} \alpha_l f_l(x_i \mid \Theta)}. \tag{6.12}$$
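For concreteness, a sketch of the E-step computation in (6.12) follows. Working in log-space avoids the numerical underflow that the large $\kappa_h$ values typical of text data would otherwise cause; the vMF log-normalizer $c_d(\kappa)$ is evaluated with SciPy's exponentially scaled Bessel function, and all names are again our own.

```python
import numpy as np
from scipy.special import ive   # scaled Bessel: ive(v, x) = iv(v, x) * exp(-x)

def log_vmf_normalizer(kappa, d):
    """log c_d(kappa), normalizer of the vMF density c_d(kappa) exp(kappa mu^T x)."""
    return ((d / 2 - 1) * np.log(kappa)
            - (d / 2) * np.log(2 * np.pi)
            - (np.log(ive(d / 2 - 1, kappa)) + kappa))   # log I_{d/2-1}(kappa)

def e_step(X, alpha, mu, kappa):
    """Posteriors p(h | x_i, Theta) from (6.12), computed row-wise in log-space."""
    d = X.shape[1]
    # log(alpha_h f_h(x_i | Theta)) for every document i and component h
    log_prob = np.log(alpha) + log_vmf_normalizer(kappa, d) + kappa * (X @ mu.T)
    log_prob -= log_prob.max(axis=1, keepdims=True)      # log-sum-exp shift
    post = np.exp(log_prob)
    return post / post.sum(axis=1, keepdims=True)        # normalize over h
```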
It can be shown (15) that the incomplete data log-likelihood, $\ln p(X \mid \Theta)$, is
non-decreasing at each iteration of the parameter and distribution updates.
Iteration over these two updates provides the foundation for our
soft-moVMF
algorithm given in Section 6.6.
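Reusing the hypothetical `e_step` and `m_step` helpers sketched above, and assuming `alpha`, `mu`, and `kappa` have been initialized, the iteration just described is simply:

```python
# Alternate the distribution update (6.12) with the parameter
# updates (6.8)-(6.11); this is the loop underlying soft-moVMF.
for _ in range(max_iter):
    post = e_step(X, alpha, mu, kappa)      # E-step
    alpha, mu, kappa = m_step(X, post)      # M-step
```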
Our second update scheme is based on the widely used hard-assignment heuristic for unsupervised learning. In this case, the distribution of the hidden variables is given by
$$q(h \mid x_i, \Theta) = \begin{cases} 1, & \text{if } h = \arg\max_{h'} p(h' \mid x_i, \Theta), \\ 0, & \text{otherwise}. \end{cases} \tag{6.13}$$
It can be shown (5) that the above hard-assignment rule actually maximizes a
non-trivial lower bound on the incomplete data log-likelihood. Iteration over
the M-step and the hard-assignment rule leads to the
hard-moVMF
algorithm
given in Section 6.6.
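A sketch of the hard-assignment rule (6.13) only has to binarize the soft posteriors of (6.12); the helper names are again our hypothetical ones from above.

```python
import numpy as np

def hard_e_step(X, alpha, mu, kappa):
    """Hard-assignment rule (6.13): all mass on h = argmax_h' p(h' | x_i, Theta)."""
    post = e_step(X, alpha, mu, kappa)      # soft posteriors from (6.12)
    hard = np.zeros_like(post)
    hard[np.arange(post.shape[0]), post.argmax(axis=1)] = 1.0
    return hard                             # q(h | x_i, Theta)
```

Iterating `hard_e_step` with `m_step`, in place of `e_step`, gives the corresponding hard-assignment loop.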
6.5 Handling High-Dimensional Text Datasets
Although the mixture model outlined in Section 6.4 appears to be straight-
forward, there is one critical issue that needs to be addressed before one can