Note that $r_{nk}$ gives the responsibility that is assigned to classifier $k$ for modelling observation $n$, and is proportional to $\rho_{nk}$ (7.62). Thus, the responsibilities are on one hand proportional to the current mixing weights $g_k(\mathbf{x})$, and on the other hand are higher for low-variance classifiers (note that $\tau_k$ is the inverse variance of classifier $k$) that feature a low expected squared prediction error $(y_{nj} - \mathbf{w}_{kj}^T \mathbf{x}_n)^2$ for the associated observation. Overall, the responsibilities are distributed such that the observations are modelled by the classifiers that are best at modelling them.
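This qualitative behaviour can be sketched numerically. The following is a minimal illustration, not the exact update from the text: it uses plain Gaussian likelihoods in place of the expectations entering $\rho_{nk}$ in (7.62), and all function and variable names are illustrative.

```python
import numpy as np

def responsibilities(g, y, X, W, tau):
    """Normalised responsibilities r[n, k] for N observations and K
    classifiers (illustrative sketch, not the exact variational update).

    g   : (N, K) mixing weights g_k(x_n)
    y   : (N,)   scalar targets y_n
    X   : (N, D) inputs x_n
    W   : (K, D) classifier weight vectors w_k
    tau : (K,)   noise precisions (inverse variances) tau_k
    """
    # squared prediction errors (y_n - w_k^T x_n)^2, shape (N, K)
    err = (y[:, None] - X @ W.T) ** 2
    # unnormalised rho_nk: mixing weight times Gaussian likelihood;
    # high precision tau_k combined with low error gives large rho_nk
    rho = g * np.sqrt(tau / (2 * np.pi)) * np.exp(-0.5 * tau * err)
    # normalise over classifiers so responsibilities sum to 1 per observation
    return rho / rho.sum(axis=1, keepdims=True)
```

With equal mixing weights, the classifier whose prediction is closer to $y_n$ receives the larger share of the responsibility for observation $n$.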
7.3.7 Required Moments of the Variational Posterior
Some of the variational distribution parameters require evaluation of the moments of one or the other random variable in our probabilistic model. In this section, these moments and the ones required at a later stage are evaluated. Throughout this section we use $\mathbb{E}_x(\mathbf{x}) = \bar{\mathbf{x}}$ and $\mathrm{cov}_x(\mathbf{x}, \mathbf{x}) = \mathbf{\Lambda}^{-1}$, where $\mathbf{x} \sim \mathcal{N}(\bar{\mathbf{x}}, \mathbf{\Lambda}^{-1})$ is a random vector that is distributed according to a multivariate Gaussian with mean $\bar{\mathbf{x}}$ and covariance matrix $\mathbf{\Lambda}^{-1}$.

Given a random variable $X \sim \mathrm{Gam}(a, b)$, its expectation is $\mathbb{E}_X(X) = a/b$, and the expectation of its logarithm is $\mathbb{E}_X(\ln X) = \psi(a) - \ln b$, where $\psi(x) = \frac{d}{dx} \ln \Gamma(x)$ is the digamma function [19]. Thus the following are the posterior moments for $q_\alpha(\alpha_k)$, $q_\beta(\beta_k)$, and $q_\tau(\tau_k)$:
$$\mathbb{E}_\alpha(\alpha_k) = \frac{a_{\alpha_k}}{b_{\alpha_k}}, \qquad (7.64)$$
$$\mathbb{E}_\alpha(\ln \alpha_k) = \psi(a_{\alpha_k}) - \ln b_{\alpha_k}, \qquad (7.65)$$
$$\mathbb{E}_\beta(\beta_k) = \frac{a_{\beta_k}}{b_{\beta_k}}, \qquad (7.66)$$
$$\mathbb{E}_\beta(\ln \beta_k) = \psi(a_{\beta_k}) - \ln b_{\beta_k}, \qquad (7.67)$$
$$\mathbb{E}_\tau(\tau_k) = \frac{a_{\tau_k}}{b_{\tau_k}}, \qquad (7.68)$$
$$\mathbb{E}_\tau(\ln \tau_k) = \psi(a_{\tau_k}) - \ln b_{\tau_k}. \qquad (7.69)$$
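The Gamma moments $\mathbb{E}(X) = a/b$ and $\mathbb{E}(\ln X) = \psi(a) - \ln b$ underlying (7.64)-(7.69) can be checked by Monte Carlo. This is a standalone sketch (the helper names are illustrative); note that $b$ here is the *rate*, so the scale passed to the sampler is $1/b$, and the digamma function is approximated by a central difference on `math.lgamma`.

```python
import math
import random

def gamma_moments_mc(a, b, n=200_000, seed=0):
    """Monte Carlo estimates of E[X] and E[ln X] for X ~ Gam(a, b)
    with shape a and rate b (illustrative check, not from the text)."""
    rng = random.Random(seed)
    xs = [rng.gammavariate(a, 1.0 / b) for _ in range(n)]  # scale = 1/rate
    mean_x = sum(xs) / n
    mean_ln_x = sum(math.log(x) for x in xs) / n
    return mean_x, mean_ln_x

def digamma(x, h=1e-5):
    """psi(x) = d/dx ln Gamma(x), via a central difference on lgamma."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

a, b = 3.0, 2.0
mean_x, mean_ln_x = gamma_moments_mc(a, b)
# the estimates should approach a/b and psi(a) - ln b, respectively
```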
To get the moments of $q_{W,\tau}(\mathbf{W}_k, \tau_k)$ and $q_V(\mathbf{v}_k)$, we can use $\mathrm{var}(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2$, and thus $\mathbb{E}(X^2) = \mathrm{var}(X) + \mathbb{E}(X)^2$, to get

$$\mathbb{E}(\mathbf{x}^T \mathbf{x}) = \sum_i \mathbb{E}(x_i^2) = \sum_i \left( \mathrm{var}(x_i) + \mathbb{E}(x_i)^2 \right) = \mathrm{Tr}(\mathrm{cov}(\mathbf{x}, \mathbf{x})) + \mathbb{E}(\mathbf{x})^T \mathbb{E}(\mathbf{x}),$$

and similarly,

$$\mathbb{E}(\mathbf{x}\mathbf{x}^T) = \mathrm{cov}(\mathbf{x}, \mathbf{x}) + \mathbb{E}(\mathbf{x}) \mathbb{E}(\mathbf{x})^T.$$
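Both second-moment identities hold exactly for empirical moments as well, which makes them easy to verify. The sketch below (with arbitrary illustrative data) computes sample mean and covariance and confirms $\mathbb{E}(\mathbf{x}^T\mathbf{x}) = \mathrm{Tr}(\mathrm{cov}(\mathbf{x},\mathbf{x})) + \mathbb{E}(\mathbf{x})^T\mathbb{E}(\mathbf{x})$ and $\mathbb{E}(\mathbf{x}\mathbf{x}^T) = \mathrm{cov}(\mathbf{x},\mathbf{x}) + \mathbb{E}(\mathbf{x})\mathbb{E}(\mathbf{x})^T$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))  # 1000 samples of a 3-d random vector x

mean = X.mean(axis=0)                      # empirical E(x)
cov = np.cov(X, rowvar=False, bias=True)   # empirical cov(x, x), /N normalised

# E(x^T x) = Tr(cov(x, x)) + E(x)^T E(x)
lhs_scalar = np.mean(np.sum(X * X, axis=1))
rhs_scalar = np.trace(cov) + mean @ mean

# E(x x^T) = cov(x, x) + E(x) E(x)^T
lhs_matrix = (X[:, :, None] * X[:, None, :]).mean(axis=0)
rhs_matrix = cov + np.outer(mean, mean)
```

Because the sample covariance is normalised by $N$ (`bias=True`), the identities hold to machine precision rather than only in expectation.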