Information Technology Reference
In-Depth Information
For a fixed sample set, $p(\theta)$ and $p(D)$ are both constants. Taking the logarithm of both sides of equation (6.58), we have:
$$
l(\theta \mid D) = \log p(\theta \mid D) = \log \frac{p(\theta)}{p(D)} + \sum_{d_i \in D_U} \log \sum_{j=1}^{|C|} p(c_j \mid \theta)\, p(d_i \mid c_j, \theta) + \sum_{d_i \in D_L} \log \bigl[ p(c(d_i) \mid \theta)\, p(d_i \mid c(d_i), \theta) \bigr] \tag{6.59}
$$
To label the unlabeled documents, we need latent variables in LSA. Here we introduce $k$ latent variables $Z = \{z_1, z_2, \ldots, z_k\}$, one per class, where each latent variable is an $n$-dimensional vector $z_i = \langle z_{i1}, z_{i2}, \ldots, z_{in} \rangle$ with one component per document: if $c(d_j) = c_i$ then $z_{ij} = 1$, otherwise $z_{ij} = 0$. So equation (6.59) can be rewritten as follows:
$$
l(\theta \mid D) = \log \frac{p(\theta)}{p(D)} + \sum_{i=1}^{|D|} \sum_{j=1}^{|C|} z_{ji} \log \bigl[ p(c_j \mid \theta)\, p(d_i \mid c_j, \theta) \bigr] \tag{6.60}
$$
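As a rough illustration of the double sum in equation (6.60), the sketch below builds the indicator variables from known labels and evaluates the sum in log space. It is only a sketch under stated assumptions: the function names `one_hot_labels` and `complete_data_term` and the toy probabilities are hypothetical, the constant term $\log \frac{p(\theta)}{p(D)}$ is left out, and the log probabilities are assumed to be supplied by the caller.

```python
import math


def one_hot_labels(labels, num_classes):
    """Build z[j][i] = 1 if document i carries class j, otherwise 0."""
    z = [[0] * len(labels) for _ in range(num_classes)]
    for i, j in enumerate(labels):
        z[j][i] = 1
    return z


def complete_data_term(z, log_prior, log_doc_given_class):
    """Sum over documents i and classes j of
    z[j][i] * (log p(c_j) + log p(d_i | c_j)).

    log_prior[j] and log_doc_given_class[j][i] are assumed inputs.
    """
    total = 0.0
    for j, row in enumerate(z):
        for i, z_ji in enumerate(row):
            if z_ji:  # skip zero terms so they never touch the logs
                total += z_ji * (log_prior[j] + log_doc_given_class[j][i])
    return total


# Toy data (hypothetical): three documents, two classes.
z = one_hot_labels([0, 1, 0], num_classes=2)
log_prior = [math.log(0.5), math.log(0.5)]
log_doc_given_class = [
    [math.log(0.2), math.log(0.1), math.log(0.4)],
    [math.log(0.3), math.log(0.6), math.log(0.1)],
]
value = complete_data_term(z, log_prior, log_doc_given_class)
```

Because each document has exactly one nonzero indicator, only one class term per document contributes, which is what removes the inner log-of-sum that appears in equation (6.59).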
In equation (6.60), $z_{ji}$ is known for the labeled documents. The learning task is to find the model parameters that maximize the likelihood and to estimate $z_{ji}$ for the unlabeled documents.
Here we again apply the EM algorithm to learn from the unlabeled documents, although the process differs somewhat from the previous stage. In the E step of the $k$th iteration, we use the naïve Bayesian classifier to find the class labels of the unlabeled documents based on the current parameter estimates:
$$
p(c_j \mid d; \theta^k) = \frac{p(c_j \mid \theta^k) \prod_{r=1}^{m} p(w_r \mid c_j; \theta^k)}{\sum_{i=1}^{|C|} p(c_i \mid \theta^k) \prod_{r=1}^{m} p(w_r \mid c_i; \theta^k)}, \quad j \in \{1, \ldots, |C|\}
$$
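The E step above can be sketched as follows. This is a minimal illustration, not the book's implementation: the names `e_step_posterior` and `map_label` are hypothetical, the product is assumed to run over the word occurrences of the document, all word probabilities are assumed strictly positive (for example after smoothing), and the computation is done in log space only to avoid numerical underflow.

```python
import math


def e_step_posterior(word_ids, class_prior, word_given_class):
    """Return the class posteriors p(c_j | d) for one document.

    word_ids: the document as a list of vocabulary indices (assumed format).
    class_prior[j]: current estimate of p(c_j).
    word_given_class[j][t]: current estimate of p(w_t | c_j), all > 0.
    """
    log_joint = []
    for j, prior in enumerate(class_prior):
        s = math.log(prior)
        for w in word_ids:
            s += math.log(word_given_class[j][w])
        log_joint.append(s)
    # Subtract the maximum before exponentiating (log-sum-exp trick).
    top = max(log_joint)
    unnorm = [math.exp(s - top) for s in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def map_label(posterior):
    """Index of the class with the largest posterior probability."""
    return max(range(len(posterior)), key=posterior.__getitem__)


# Toy parameters (hypothetical): two classes, vocabulary of two words.
class_prior = [0.5, 0.5]
word_given_class = [[0.8, 0.2], [0.3, 0.7]]
posterior = e_step_posterior([0, 0, 1], class_prior, word_given_class)
label = map_label(posterior)
```

The denominator of the posterior is the same for every class, so the MAP label could also be read directly from the largest unnormalized log score; normalizing is only needed when the probabilities themselves are of interest.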
The class $c_i$ with the maximum a posteriori (MAP) probability is taken as the expected label of the unlabeled document $d$: $z_{id} = 1$, $z_{jd} = 0$ ($j \neq i$).
In the M step, we re-estimate the parameters by maximizing the likelihood given the expectation obtained in the preceding E step:
$$
\theta_{c_j} = p(c_j \mid \theta) = \frac{\sum_{i=1}^{|D|} z_{ji}}{|D|} \tag{6.61a}
$$
$$
\theta_{w_t \mid c_j} = p(w_t \mid c_j, \theta) = \frac{\alpha_j + \sum_{i=1}^{|D|} n(d_i, w_t)\, z_{ji}}{\alpha_0 + \sum_{k=1}^{m} \sum_{i=1}^{|D|} n(d_i, w_k)\, z_{ji}} \tag{6.61b}
$$
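The M-step updates (6.61a) and (6.61b) can be sketched as below. The function name `m_step` and the toy data are hypothetical. The smoothing is also an assumption: a single symmetric value `alpha` stands in for the numerator hyperparameter, and $\alpha_0$ is taken to be `alpha` times the vocabulary size, since this section does not define the hyperparameters.

```python
def m_step(z, counts, alpha=1.0):
    """Re-estimate the parameters from the indicator variables.

    z[j][i]: 1 if document i is assigned to class j, otherwise 0.
    counts[i][t]: n(d_i, w_t), occurrences of word t in document i.
    alpha: symmetric smoothing constant (assumption, see lead-in).
    """
    num_docs = len(counts)
    vocab_size = len(counts[0])
    class_prior = []
    word_given_class = []
    for row in z:
        # (6.61a): fraction of documents assigned to this class.
        class_prior.append(sum(row) / num_docs)
        # Word counts accumulated over the documents of this class.
        weighted = [
            sum(row[i] * counts[i][t] for i in range(num_docs))
            for t in range(vocab_size)
        ]
        # (6.61b), with alpha_0 assumed equal to alpha * vocab_size.
        denom = alpha * vocab_size + sum(weighted)
        word_given_class.append([(alpha + c) / denom for c in weighted])
    return class_prior, word_given_class


# Toy data (hypothetical): three documents, two classes, two words.
z = [[1, 0, 1], [0, 1, 0]]
counts = [[2, 0], [0, 3], [1, 1]]
class_prior, word_given_class = m_step(z, counts, alpha=1.0)
```

With this choice of $\alpha_0$, each row of `word_given_class` sums to one, so the output can be fed straight back into the next E step.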
Organizing Web information into catalogs is an effective way to improve the
effectiveness and efficiency of information retrieval. It can be achieved by
learning classifiers from labeled documents and predicting the class labels of new