where $\mathbf{H}$ is the Hessian matrix of $E(\mathbf{V})$ as used in the IRLS algorithm. Overall, the Laplace approximation to the posterior $q_V(\mathbf{V})$ is given by the multivariate Gaussian

$$q_V(\mathbf{V}) \approx \mathcal{N}(\mathbf{V} \,|\, \mathbf{V}^*, \boldsymbol{\Lambda}_V^{*-1}), \tag{7.51}$$

where $\mathbf{V}^*$ is the solution to (7.47), and $\boldsymbol{\Lambda}_V^*$ is the Hessian matrix evaluated at $\mathbf{V}^*$.
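As an illustration, here is a minimal numerical sketch of this Laplace step: Newton iterations to the mode (as in IRLS), then a Gaussian whose covariance is the inverse Hessian at the mode. The objective, its gradient, and all numeric values below are hypothetical placeholders, not the book's mixing model.

```python
import numpy as np

def laplace_approximation(grad, hess, v0, n_iter=20):
    """Newton iterations to the mode of a negative log-posterior,
    then a Gaussian with the inverse Hessian at the mode as covariance."""
    v = v0.astype(float)
    for _ in range(n_iter):
        v = v - np.linalg.solve(hess(v), grad(v))  # Newton step, as in IRLS
    return v, np.linalg.inv(hess(v))  # mean V*, covariance (Lambda_V*)^{-1}

# toy quadratic objective E(v) = 0.5 v^T A v - b^T v, whose mode is A^{-1} b
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 0.5])
v_star, cov = laplace_approximation(lambda v: A @ v - b, lambda v: A, np.zeros(2))
```

For a quadratic objective the approximation is exact and Newton converges in one step; for the actual mixing-weight posterior the Gaussian is only a local approximation around $\mathbf{V}^*$.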
7.3.5 Mixing Weight Priors $q_\beta(\boldsymbol{\beta})$
By (7.19), $p(\boldsymbol{\beta})$ factorises with respect to $k$, and thus allows us to find $q_\beta(\beta_k)$ for each classifier separately, which, by (7.15), (7.18) and (7.24), requires the evaluation of

$$\ln q_\beta(\beta_k) = \mathbb{E}_V(\ln p(\mathbf{v}_k \,|\, \beta_k)) + \ln p(\beta_k). \tag{7.52}$$
Using (7.13) and (7.14), the expectation and log-density are given by

$$\mathbb{E}_V(\ln p(\mathbf{v}_k \,|\, \beta_k)) = \frac{D_V}{2} \ln \beta_k - \frac{\beta_k}{2} \mathbb{E}_V(\mathbf{v}_k^\top \mathbf{v}_k) + \text{const.}, \tag{7.53}$$

$$\ln p(\beta_k) = (a_\beta - 1) \ln \beta_k - \beta_k b_\beta + \text{const.} \tag{7.54}$$
Combining the above, we get the variational posterior

$$\ln q_\beta(\beta_k) = \left( a_\beta - 1 + \frac{D_V}{2} \right) \ln \beta_k - \left( b_\beta + \frac{1}{2} \mathbb{E}_V(\mathbf{v}_k^\top \mathbf{v}_k) \right) \beta_k + \text{const.} = \ln \text{Gam}(\beta_k \,|\, a_{\beta_k}, b_{\beta_k}), \tag{7.55}$$
with the distribution parameters

$$a_{\beta_k} = a_\beta + \frac{D_V}{2}, \tag{7.56}$$

$$b_{\beta_k} = b_\beta + \frac{1}{2} \mathbb{E}_V(\mathbf{v}_k^\top \mathbf{v}_k). \tag{7.57}$$
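In code, the updates (7.56) and (7.57) are one line each. A sketch with hypothetical hyperparameter values; in the full algorithm, the expected squared norm of $\mathbf{v}_k$ would come from the Laplace posterior $q_V$:

```python
# Gamma posterior parameters for beta_k, following (7.56)-(7.57);
# all numeric values below are hypothetical placeholders
a_beta, b_beta = 1e-2, 1e-4   # prior hyperparameters
D_V = 3                       # dimensionality of v_k
E_vk_sq = 0.8                 # E_V(v_k^T v_k) under q_V

a_beta_k = a_beta + D_V / 2.0
b_beta_k = b_beta + E_vk_sq / 2.0
```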
As the priors on $\mathbf{v}_k$ are similar to the ones on $\mathbf{w}_k$, they cause the same effect: as $b_{\beta_k}$ increases proportionally to the expected size $\mathbb{E}_V(\|\mathbf{v}_k\|^2)$, the expectation of the posterior $\mathbb{E}_\beta(\beta_k) = a_{\beta_k}/b_{\beta_k}$ decreases in proportion to it. This expectation determines the shrinkage on $\mathbf{v}_k$ (see (7.47)), and thus the strength of the shrinkage prior is reduced if $\mathbf{v}_k$ is expected to have large elements, which is an intuitively sensible procedure.
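This shrinkage behaviour is easy to verify numerically: holding the prior hyperparameters fixed (the values below are hypothetical), $\mathbb{E}_\beta(\beta_k) = a_{\beta_k}/b_{\beta_k}$ falls as the expected squared norm of $\mathbf{v}_k$ grows.

```python
def expected_beta(a_beta, b_beta, D_V, E_vk_sq):
    """E(beta_k) = a_beta_k / b_beta_k with the updates (7.56)-(7.57)."""
    return (a_beta + D_V / 2.0) / (b_beta + E_vk_sq / 2.0)

# larger expected ||v_k||^2 implies a weaker shrinkage prior on v_k
vals = [expected_beta(1e-2, 1e-4, 3, e) for e in (0.1, 1.0, 10.0)]
assert vals[0] > vals[1] > vals[2]  # strictly decreasing
```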
7.3.6 Latent Variables $q_Z(\mathbf{Z})$
To get the variational posterior over the latent variables $\mathbf{Z}$ we need to evaluate (7.24) by the use of (7.15), that is,

$$\ln q_Z(\mathbf{Z}) = \mathbb{E}_{W,\tau}(\ln p(\mathbf{Y} \,|\, \mathbf{W}, \boldsymbol{\tau}, \mathbf{Z})) + \mathbb{E}_V(\ln p(\mathbf{Z} \,|\, \mathbf{V})) + \text{const.} \tag{7.58}$$
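Since each $\mathbf{z}_n$ is a 1-of-$K$ indicator, (7.58) decomposes over data points, and $q_Z$ is obtained by exponentiating and normalising the summed expectations per observation. A generic sketch of that normalisation step only; the log-term matrix below is made up for illustration and is not the result of evaluating the two expectations above:

```python
import numpy as np

# hypothetical summed log-terms for N=2 observations and K=2 classifiers:
# entry (n, k) stands for E_{W,tau}(ln p(y_n | ...)) + E_V(ln p(z_nk = 1 | ...))
log_rho = np.array([[-1.0, -2.0],
                    [-0.5, -3.0]])

# normalise each row with the log-sum-exp trick for numerical stability
r = np.exp(log_rho - log_rho.max(axis=1, keepdims=True))
r /= r.sum(axis=1, keepdims=True)  # responsibilities q_Z(z_nk = 1)
```

Subtracting the row maximum before exponentiating leaves the normalised responsibilities unchanged but avoids underflow when the log-terms are strongly negative.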