7.3.8 The Variational Bound $\mathcal{L}(q)$
We are most interested in finding the value of $\mathcal{L}(q)$ by (7.21), as it provides us with an approximate lower bound on the logarithm of the model evidence $\ln p(\mathbf{Y})$, and is the actual expression that is to be maximised. Evaluating (7.21) by using the distribution decomposition according to (7.15), the variational bound is given by
$$
\begin{aligned}
\mathcal{L}(q) &= \int q(\mathbf{U}) \ln \frac{p(\mathbf{Y}, \mathbf{U})}{q(\mathbf{U})} \,\mathrm{d}\mathbf{U} \\
&= \mathbb{E}_{\mathbf{W},\tau,\alpha,\mathbf{Z},\mathbf{V},\beta}\big(\ln p(\mathbf{Y}, \mathbf{W}, \tau, \alpha, \mathbf{Z}, \mathbf{V}, \beta)\big) - \mathbb{E}_{\mathbf{W},\tau,\alpha,\mathbf{Z},\mathbf{V},\beta}\big(\ln q(\mathbf{W}, \tau, \alpha, \mathbf{Z}, \mathbf{V}, \beta)\big) \\
&= \mathbb{E}_{\mathbf{W},\tau,\mathbf{Z}}\big(\ln p(\mathbf{Y} \mid \mathbf{W}, \tau, \mathbf{Z})\big) + \mathbb{E}_{\mathbf{W},\tau,\alpha}\big(\ln p(\mathbf{W}, \tau \mid \alpha)\big) + \mathbb{E}_{\alpha}\big(\ln p(\alpha)\big) \\
&\quad + \mathbb{E}_{\mathbf{Z},\mathbf{V}}\big(\ln p(\mathbf{Z} \mid \mathbf{V})\big) + \mathbb{E}_{\mathbf{V},\beta}\big(\ln p(\mathbf{V} \mid \beta)\big) + \mathbb{E}_{\beta}\big(\ln p(\beta)\big) \\
&\quad - \mathbb{E}_{\mathbf{W},\tau}\big(\ln q(\mathbf{W}, \tau)\big) - \mathbb{E}_{\alpha}\big(\ln q(\alpha)\big) - \mathbb{E}_{\mathbf{Z}}\big(\ln q(\mathbf{Z})\big) - \mathbb{E}_{\mathbf{V}}\big(\ln q(\mathbf{V})\big) - \mathbb{E}_{\beta}\big(\ln q(\beta)\big),
\end{aligned}
\tag{7.75}
$$
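As a brief reminder of why maximising $\mathcal{L}(q)$ is the right objective, the standard variational decomposition (the well-known identity underlying (7.21), restated here for convenience) writes the log evidence as

```latex
\ln p(\mathbf{Y})
  = \mathcal{L}(q)
  + \mathrm{KL}\!\big(q(\mathbf{U}) \,\big\|\, p(\mathbf{U} \mid \mathbf{Y})\big),
\qquad
\mathrm{KL}(\cdot \,\|\, \cdot) \ge 0 ,
```

so $\mathcal{L}(q) \le \ln p(\mathbf{Y})$ for any choice of $q$, with equality exactly when $q(\mathbf{U})$ matches the true posterior $p(\mathbf{U} \mid \mathbf{Y})$. Maximising $\mathcal{L}(q)$ therefore tightens the bound on the model evidence while driving $q$ towards the posterior.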
where all expectations are taken with respect to the variational distribution q .
These are evaluated one by one, using the previously derived moments of the
variational posteriors.
To derive $\mathbb{E}_{\mathbf{W},\tau,\mathbf{Z}}(\ln p(\mathbf{Y} \mid \mathbf{W}, \tau, \mathbf{Z}))$, we use (7.6) and (7.7) to get
$$
\begin{aligned}
\mathbb{E}_{\mathbf{W},\tau,\mathbf{Z}}\big(\ln p(\mathbf{Y} \mid \mathbf{W}, \tau, \mathbf{Z})\big)
&= \sum_n \sum_k \mathbb{E}_{\mathbf{Z}}(z_{nk}) \sum_j \mathbb{E}_{\mathbf{W},\tau}\!\left(\ln \mathcal{N}\big(y_{nj} \mid \mathbf{w}_{kj}^{\mathsf{T}} \mathbf{x}_n, \tau_k^{-1}\big)\right) \\
&= \sum_n \sum_k r_{nk} \sum_j \left( -\frac{1}{2} \ln 2\pi + \frac{1}{2} \mathbb{E}_{\tau}(\ln \tau_k) - \frac{1}{2} \mathbb{E}_{\mathbf{W},\tau}\!\left(\tau_k \big(y_{nj} - \mathbf{w}_{kj}^{\mathsf{T}} \mathbf{x}_n\big)^2\right) \right) \\
&= \sum_k \left( \frac{D_Y}{2} \big( \psi(a_{\tau_k}) - \ln b_{\tau_k} - \ln 2\pi \big) \sum_n r_{nk} - \frac{1}{2} \sum_n r_{nk} \sum_j \left( \frac{a_{\tau_k}}{b_{\tau_k}} \big(y_{nj} - \mathbf{w}_{kj}^{\mathsf{T}} \mathbf{x}_n\big)^2 + \mathbf{x}_n^{\mathsf{T}} \boldsymbol{\Lambda}_k^{-1} \mathbf{x}_n \right) \right) \\
&= \sum_k \left( \frac{D_Y}{2} \big( \psi(a_{\tau_k}) - \ln b_{\tau_k} - \ln 2\pi \big) \sum_n r_{nk} - \frac{1}{2} \sum_n r_{nk} \left( \frac{a_{\tau_k}}{b_{\tau_k}} \big\| \mathbf{y}_n - \mathbf{W}_k \mathbf{x}_n \big\|^2 + D_Y \, \mathbf{x}_n^{\mathsf{T}} \boldsymbol{\Lambda}_k^{-1} \mathbf{x}_n \right) \right).
\end{aligned}
\tag{7.76}
$$
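To make (7.76) concrete, the following is a minimal numerical sketch of the final, collapsed form. The array names and shapes (`Y`, `X`, `R`, `W`, `Lambda_inv`, `a_tau`, `b_tau`) are assumptions for illustration, not notation fixed by the text, and the `digamma` helper is a hand-rolled approximation used only to keep the sketch free of a SciPy dependency.

```python
import numpy as np

def digamma(x):
    """Approximate psi(x) for x > 0 via the recurrence psi(x) = psi(x+1) - 1/x
    followed by the standard asymptotic series (assumed helper, not from the text)."""
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return r + np.log(x) - 0.5 / x - f * (1/12 - f * (1/120 - f / 252))

def expected_log_likelihood(Y, X, R, W, Lambda_inv, a_tau, b_tau):
    """Last line of (7.76). Assumed shapes:
    Y: (N, D_Y) outputs, X: (N, D_X) inputs, R: (N, K) responsibilities r_nk,
    W: (K, D_Y, D_X) posterior weight means, Lambda_inv: (K, D_X, D_X),
    a_tau, b_tau: (K,) Gamma posterior parameters of the noise precisions tau_k."""
    N, D_Y = Y.shape
    K = R.shape[1]
    total = 0.0
    for k in range(K):
        E_ln_tau = digamma(a_tau[k]) - np.log(b_tau[k])       # psi(a) - ln b
        resid = Y - X @ W[k].T                                 # y_n - W_k x_n, (N, D_Y)
        quad = np.einsum('ni,ij,nj->n', X, Lambda_inv[k], X)   # x_n^T Lambda_k^-1 x_n
        total += (D_Y / 2) * (E_ln_tau - np.log(2 * np.pi)) * R[:, k].sum()
        total -= 0.5 * np.sum(R[:, k] * ((a_tau[k] / b_tau[k]) * np.sum(resid**2, axis=1)
                                         + D_Y * quad))
    return total
```

A useful sanity check is that this collapsed form agrees term-by-term with the second line of (7.76), evaluated element-wise over $n$, $k$, and $j$, once the expectation $\mathbb{E}_{\mathbf{W},\tau}(\tau_k (y_{nj} - \mathbf{w}_{kj}^{\mathsf{T}} \mathbf{x}_n)^2)$ is substituted by its moments.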
The classifier model parameters expectation $\mathbb{E}_{\mathbf{W},\tau,\alpha}(\ln p(\mathbf{W}, \tau \mid \alpha))$ can be derived by using (7.7) and (7.16), and is given by

$$
\mathbb{E}_{\mathbf{W},\tau,\alpha}\big(\ln p(\mathbf{W}, \tau \mid \alpha)\big)
= \sum_k \left( \sum_j \mathbb{E}_{\mathbf{W},\tau,\alpha}\!\left(\ln \mathcal{N}\big(\mathbf{w}_{kj} \mid \mathbf{0}, (\alpha_k \tau_k)^{-1} \mathbf{I}\big)\right) + \mathbb{E}_{\tau}\big(\ln \mathrm{Gam}(\tau_k \mid a_\tau, b_\tau)\big) \right).
\tag{7.77}
$$