$\kappa_h \geq 0$, we obtain
$$\alpha_h = \frac{1}{n}\sum_{i=1}^{n} p(h \mid x_i, \Theta), \tag{6.8}$$
$$r_h = \sum_{i=1}^{n} x_i \, p(h \mid x_i, \Theta), \tag{6.9}$$
$$\mu_h = \frac{r_h}{\|r_h\|}, \tag{6.10}$$
$$\frac{I_{d/2}(\kappa_h)}{I_{d/2-1}(\kappa_h)} = \frac{\|r_h\|}{\sum_{i=1}^{n} p(h \mid x_i, \Theta)}. \tag{6.11}$$
Observe that (6.10) and (6.11) are intuitive generalizations of (6.4) and (6.5)
respectively, and they correspond to an M-step in an EM framework. Given
these parameter updates, we now look at schemes for updating the distributions of $Z \mid (X, \Theta)$ (i.e., an E-step) to maximize the likelihood of the data given the parameter estimates above.
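To make these updates concrete, here is a minimal NumPy sketch of the M-step in (6.8)–(6.11). All names are illustrative (`X` holds the $n$ unit-norm document vectors as rows, `post[i, h]` holds $p(h \mid x_i, \Theta)$), and since (6.11) cannot be solved for $\kappa_h$ in closed form, the sketch uses a standard approximation to the inverse of the Bessel-function ratio rather than an exact numerical inversion.

```python
import numpy as np

def m_step(X, post):
    """M-step of (6.8)-(6.11). X: n x d unit-norm rows; post: n x k posteriors."""
    n, d = X.shape
    weights = post.sum(axis=0)              # sum_i p(h | x_i, Theta), shape (k,)
    alpha = weights / n                     # (6.8): mixing proportions
    r = post.T @ X                          # (6.9): r_h = sum_i x_i p(h | x_i, Theta)
    r_norm = np.linalg.norm(r, axis=1)      # ||r_h||
    mu = r / r_norm[:, None]                # (6.10): mean directions
    r_bar = r_norm / weights                # right-hand side of (6.11)
    # (6.11) equates a ratio of Bessel functions to r_bar; a widely used
    # approximation to its inverse gives the concentration estimate:
    kappa = (r_bar * d - r_bar**3) / (1.0 - r_bar**2)
    return alpha, mu, kappa
```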
From the standard EM framework, the distribution of the hidden variables
(34; 11) is given by
$$p(h \mid x_i, \Theta) = \frac{\alpha_h f_h(x_i \mid \Theta)}{\sum_{l=1}^{k} \alpha_l f_l(x_i \mid \Theta)}. \tag{6.12}$$
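For concreteness, a sketch of the E-step computation in (6.12) follows. Working in log-space avoids the numerical underflow that the large $\kappa_h$ values typical of text data would otherwise cause; the vMF log-normalizer $c_d(\kappa)$ is evaluated with SciPy's exponentially scaled Bessel function, and all names are again our own.

```python
import numpy as np
from scipy.special import ive   # scaled Bessel: ive(v, x) = iv(v, x) * exp(-x)

def log_vmf_normalizer(kappa, d):
    """log c_d(kappa), normalizer of the vMF density c_d(kappa) exp(kappa mu^T x)."""
    return ((d / 2 - 1) * np.log(kappa)
            - (d / 2) * np.log(2 * np.pi)
            - (np.log(ive(d / 2 - 1, kappa)) + kappa))   # log I_{d/2-1}(kappa)

def e_step(X, alpha, mu, kappa):
    """Posteriors p(h | x_i, Theta) from (6.12), computed row-wise in log-space."""
    d = X.shape[1]
    # log(alpha_h f_h(x_i | Theta)) for every document i and component h
    log_prob = np.log(alpha) + log_vmf_normalizer(kappa, d) + kappa * (X @ mu.T)
    log_prob -= log_prob.max(axis=1, keepdims=True)      # log-sum-exp shift
    post = np.exp(log_prob)
    return post / post.sum(axis=1, keepdims=True)        # normalize over h
```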
It can be shown (15) that the incomplete data log-likelihood, $\ln p(X \mid \Theta)$, is
non-decreasing at each iteration of the parameter and distribution updates.
Iteration over these two updates provides the foundation for our
soft-moVMF
algorithm given in Section 6.6.
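Reusing the hypothetical `e_step` and `m_step` helpers sketched above, and assuming `alpha`, `mu`, and `kappa` have been initialized, the iteration just described is simply:

```python
# Alternate the distribution update (6.12) with the parameter
# updates (6.8)-(6.11); this is the loop underlying soft-moVMF.
for _ in range(max_iter):
    post = e_step(X, alpha, mu, kappa)      # E-step
    alpha, mu, kappa = m_step(X, post)      # M-step
```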
Our second update scheme is based on the widely used hard-assignment heuristic for unsupervised learning. In this case, the distribution of the hidden variables is given by
$$q(h \mid x_i, \Theta) = \begin{cases} 1, & \text{if } h = \arg\max_{h'} p(h' \mid x_i, \Theta), \\ 0, & \text{otherwise}. \end{cases} \tag{6.13}$$
It can be shown (5) that the above hard-assignment rule actually maximizes a
non-trivial lower bound on the incomplete data log-likelihood. Iteration over
the M-step and the hard-assignment rule leads to the
hard-moVMF
algorithm
given in Section 6.6.
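A sketch of the hard-assignment rule (6.13) only has to binarize the soft posteriors of (6.12); the helper names are again our hypothetical ones from above.

```python
import numpy as np

def hard_e_step(X, alpha, mu, kappa):
    """Hard-assignment rule (6.13): all mass on h = argmax_h' p(h' | x_i, Theta)."""
    post = e_step(X, alpha, mu, kappa)      # soft posteriors from (6.12)
    hard = np.zeros_like(post)
    hard[np.arange(post.shape[0]), post.argmax(axis=1)] = 1.0
    return hard                             # q(h | x_i, Theta)
```

Iterating `hard_e_step` with `m_step`, in place of `e_step`, gives the corresponding hard-assignment loop.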
6.5 Handling High-Dimensional Text Datasets
Although the mixture model outlined in Section 6.4 appears to be straight-
forward, there is one critical issue that needs to be addressed before one can