κ_h to 0, we obtain

\alpha_h = \frac{1}{n} \sum_{i=1}^{n} p(h \mid x_i, \Theta), \qquad (6.8)

r_h = \sum_{i=1}^{n} x_i \, p(h \mid x_i, \Theta), \qquad (6.9)

\mu_h = \frac{r_h}{\| r_h \|}, \qquad (6.10)

\frac{I_{d/2}(\kappa_h)}{I_{d/2-1}(\kappa_h)} = \frac{\| r_h \|}{\sum_{i=1}^{n} p(h \mid x_i, \Theta)}. \qquad (6.11)
Observe that (6.10) and (6.11) are intuitive generalizations of (6.4) and (6.5)
respectively, and they correspond to an M-step in an EM framework. Given
these parameter updates, we now look at schemes for updating the distribu-
tions of Z | (X, Θ) (i.e., an E-step) to maximize the likelihood of the data given
the parameter estimates above.
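Before turning to the E-step, the following Python sketch shows how the M-step updates (6.8)-(6.11) could be computed for unit-norm data. The function name, array shapes, and the closed-form approximation used in place of numerically inverting the Bessel-function ratio in (6.11) are assumptions made for this illustration, not notation from the chapter.

    import numpy as np

    def m_step(X, posteriors):
        """M-step updates (6.8)-(6.11) for a mixture of von Mises-Fisher distributions.

        X          : (n, d) array of unit-norm data vectors x_i.
        posteriors : (n, k) array whose (i, h) entry is p(h | x_i, Theta).
        Returns mixing weights alpha (k,), mean directions mu (k, d),
        and concentrations kappa (k,).
        """
        n, d = X.shape
        resp_sum = posteriors.sum(axis=0)        # sum_i p(h | x_i, Theta), shape (k,)
        alpha = resp_sum / n                     # (6.8)
        r = posteriors.T @ X                     # (6.9): r_h = sum_i x_i p(h | x_i, Theta)
        r_norm = np.linalg.norm(r, axis=1)       # ||r_h||
        mu = r / r_norm[:, None]                 # (6.10)
        # (6.11) defines kappa_h implicitly through a ratio of modified Bessel
        # functions; the commonly used closed-form approximation below (assumed
        # here for illustration) avoids inverting that ratio numerically.
        r_bar = r_norm / resp_sum
        kappa = (r_bar * d - r_bar ** 3) / (1.0 - r_bar ** 2)
        return alpha, mu, kappa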
From the standard EM framework, the distribution of the hidden variables
(34; 11) is given by
p(h \mid x_i, \Theta) = \frac{\alpha_h f_h(x_i \mid \Theta)}{\sum_{l=1}^{k} \alpha_l f_l(x_i \mid \Theta)}. \qquad (6.12)
It can be shown (15) that the incomplete data log-likelihood, ln p(X | Θ), is
non-decreasing at each iteration of the parameter and distribution updates.
Iteration over these two updates provides the foundation for our soft-moVMF
algorithm given in Section 6.6.
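As a minimal sketch of the E-step in (6.12), the posteriors can be computed by evaluating the vMF log-densities and normalizing across components in log space for numerical stability. The sketch assumes the standard vMF density f_h(x | Θ) = c_d(κ_h) exp(κ_h μ_h^T x); the helper names and the use of SciPy's exponentially scaled Bessel function are illustrative choices, not the chapter's implementation.

    import numpy as np
    from scipy.special import ive, logsumexp

    def log_vmf_density(X, mu, kappa):
        """Log of the vMF density c_d(kappa) * exp(kappa * mu^T x) for each row of X."""
        d = X.shape[1]
        # log c_d(kappa); log I_v(kappa) = log(ive(v, kappa)) + kappa for stability
        log_norm = ((d / 2 - 1) * np.log(kappa)
                    - (d / 2) * np.log(2 * np.pi)
                    - (np.log(ive(d / 2 - 1, kappa)) + kappa))
        return log_norm + kappa * (X @ mu)

    def soft_e_step(X, alpha, mu, kappa):
        """Posteriors p(h | x_i, Theta) of (6.12), computed in log space."""
        k = alpha.shape[0]
        log_joint = np.column_stack([
            np.log(alpha[h]) + log_vmf_density(X, mu[h], kappa[h]) for h in range(k)
        ])                                           # shape (n, k)
        return np.exp(log_joint - logsumexp(log_joint, axis=1, keepdims=True))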
Our second update scheme is based on the widely used hard-assignment
heuristic for unsupervised learning. In this case, the distribution of the hidden
variables is given by
q(h \mid x_i, \Theta) =
\begin{cases}
1, & \text{if } h = \arg\max_{h'} \, p(h' \mid x_i, \Theta), \\
0, & \text{otherwise}.
\end{cases} \qquad (6.13)
It can be shown (5) that the above hard-assignment rule actually maximizes a
non-trivial lower bound on the incomplete data log-likelihood. Iteration over
the M-step and the hard-assignment rule leads to the hard-moVMF algorithm
given in Section 6.6.
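A corresponding sketch of the hard-assignment rule (6.13), reusing the illustrative log_vmf_density helper from the previous sketch, is given below. Since the denominator of (6.12) does not depend on h, the argmax can be taken directly over log α_h + log f_h(x_i | Θ).

    import numpy as np

    def hard_e_step(X, alpha, mu, kappa):
        """Hard assignments q(h | x_i, Theta) of (6.13): one-hot rows."""
        # argmax over h of p(h | x_i, Theta) equals argmax of alpha_h f_h(x_i | Theta)
        log_joint = np.column_stack([
            np.log(alpha[h]) + log_vmf_density(X, mu[h], kappa[h])
            for h in range(alpha.shape[0])
        ])
        q = np.zeros_like(log_joint)
        q[np.arange(X.shape[0]), np.argmax(log_joint, axis=1)] = 1.0
        return q

Alternating m_step with soft_e_step or hard_e_step then mirrors, in this simplified form, the soft-moVMF and hard-moVMF iterations described above.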
6.5 Handling High-Dimensional Text Datasets
Although the mixture model outlined in Section 6.4 appears to be straight-
forward, there is one critical issue that needs to be addressed before one can
 