Biomedical Engineering Reference
In-Depth Information
\[
\frac{\partial L[q(x), \gamma]}{\partial \gamma} = \int_{-\infty}^{\infty} q(x)\, dx - 1 = 0. \tag{B.55}
\]
Thus, using Eq. (B.54), we have
\[
q(x) = e^{\gamma - 1}\, p(x, y \mid \theta), \tag{B.56}
\]
and using Eq. (B.55), we also have
\[
\int_{-\infty}^{\infty} q(x)\, dx = e^{\gamma - 1} \int_{-\infty}^{\infty} p(x, y \mid \theta)\, dx = e^{\gamma - 1}\, p(y \mid \theta) = 1. \tag{B.57}
\]
We thereby get
\[
e^{\gamma - 1} = \frac{1}{p(y \mid \theta)}. \tag{B.58}
\]
Substitution of Eq. (B.58) into (B.56) results in the relationship
\[
q(x) = \frac{p(x, y \mid \theta)}{p(y \mid \theta)} = p(x \mid y). \tag{B.59}
\]
The above equation shows that the probability distribution that maximizes the free
energy $F[q(x), \theta]$ is the posterior distribution $p(x \mid y)$.
Note that to derive the posterior distribution, we do not explicitly use Bayes' rule.
Instead, we use the optimization of the functional called the free energy. This idea is
further extended in variational Bayesian inference in the following sections to derive
an approximate posterior distribution in a more complicated situation.
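The result in Eq. (B.59) can be checked numerically. The following is a minimal sketch, assuming a discrete variable x with three states in place of the continuous x used in the text, so that the integrals become sums. The joint values in `joint` and the helper `free_energy` are hypothetical and chosen only for illustration.

```python
import numpy as np

def free_energy(q, joint):
    """Discrete analogue of F[q, theta] = sum_x q(x) (log p(x, y | theta) - log q(x))."""
    q = np.asarray(q, dtype=float)
    joint = np.asarray(joint, dtype=float)
    mask = q > 0  # terms with q(x) = 0 contribute nothing to the sum
    return float(np.sum(q[mask] * (np.log(joint[mask]) - np.log(q[mask]))))

# Hypothetical values of p(x, y | theta) for x = 0, 1, 2 at one observed y.
joint = np.array([0.10, 0.25, 0.05])
evidence = joint.sum()        # p(y | theta), the sum over x as in Eq. (B.57)
posterior = joint / evidence  # p(x | y), as in Eq. (B.59)

# Free energy evaluated at the posterior.
f_post = free_energy(posterior, joint)

# Free energy evaluated at many randomly drawn normalized distributions q(x).
rng = np.random.default_rng(0)
f_random = [free_energy(rng.dirichlet(np.ones(3)), joint) for _ in range(1000)]

print("F at posterior      :", f_post)
print("log p(y | theta)    :", np.log(evidence))
print("largest F, random q :", max(f_random))
```

In this sketch, none of the randomly drawn distributions gives a larger free energy than the posterior, and the free energy at the posterior coincides with log p(y | θ).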
B.6.1.2 Derivation of M-step
We next maximize the free energy with respect to the hyperparameter $\theta$. Once
$F[q, \theta]$ is maximized with respect to $q(x)$, the free energy is written as
\[
F[p(x \mid y), \theta] = \int dx\, p(x \mid y) \left[ \log p(x, y \mid \theta) - \log p(x \mid y) \right]
= \Theta(\theta) + H[p(x \mid y)], \tag{B.60}
\]
where
\[
\Theta(\theta) = \int dx\, p(x \mid y) \log p(x, y \mid \theta). \tag{B.61}
\]
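The split of the free energy in Eq. (B.60) into two terms can be illustrated with a small numerical sketch. It assumes a discrete x with three states, so the integrals become sums, and uses hypothetical joint values; the posterior here is computed with the same θ at which the free energy is evaluated.

```python
import numpy as np

# Hypothetical values of p(x, y | theta) for x = 0, 1, 2 at one observed y.
joint = np.array([0.10, 0.25, 0.05])
posterior = joint / joint.sum()  # p(x | y), as in Eq. (B.59)

# First term of Eq. (B.60): discrete analogue of Eq. (B.61).
theta_term = float(np.sum(posterior * np.log(joint)))

# Second term of Eq. (B.60): minus the posterior average of log p(x | y).
entropy_term = float(-np.sum(posterior * np.log(posterior)))

free_energy_at_posterior = theta_term + entropy_term
log_evidence = float(np.log(joint.sum()))

print("Theta(theta)     :", theta_term)
print("H[p(x | y)]      :", entropy_term)
print("Theta + H        :", free_energy_at_posterior)
print("log p(y | theta) :", log_evidence)
```

Under these assumptions the two terms add up to log p(y | θ), and the second term is non-negative for a discrete distribution.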
This $\Theta(\theta)$ is equal to the average data likelihood, and
\[
H[p(x \mid y)] = -\int dx\, p(x \mid y) \log p(x \mid y) \tag{B.62}
\]
 