B.6.1 Derivation of the EM Algorithm Using the Free Energy
B.6.1.1 Derivation of Posterior Distribution (E-step)
As preparation for introducing the variational technique, we derive the EM algorithm in a different manner, based on the optimization of a functional called the free energy. In this section, the hyperparameters are collectively expressed as $\theta$. We define the functional
\[
F[q(x), \theta] = \int dx\, q(x) \left[ \log p(x, y|\theta) - \log q(x) \right]. \tag{B.51}
\]
This $F[q(x), \theta]$ is a function of the hyperparameters $\theta$ and of an arbitrary probability distribution $q(x)$. It is called the free energy, a term taken from statistical physics. We show in the following that maximizing the free energy $F[q(x), \theta]$ with respect to $q(x)$ results in the E step, and maximizing it with respect to the hyperparameters results in the M step of the EM algorithm.
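To make the definition of the free energy concrete, the following minimal sketch evaluates a discrete analogue of it, in which the integral over $x$ is replaced by a sum over three states. The joint values in `joint`, the helper name `free_energy`, and the randomly drawn trial distribution are assumptions made purely for illustration; they do not come from the text. The sketch compares the free energy at the posterior $p(x|y,\theta)$ with the free energy at an arbitrary $q(x)$.

```python
import numpy as np

def free_energy(q, joint):
    """Discrete analogue of the free energy:
    sum_x q(x) * [log p(x, y | theta) - log q(x)]."""
    q = np.asarray(q, dtype=float)
    joint = np.asarray(joint, dtype=float)
    mask = q > 0  # terms with q(x) = 0 contribute nothing to the sum
    return float(np.sum(q[mask] * (np.log(joint[mask]) - np.log(q[mask]))))

# Hypothetical values of p(x, y | theta) for one fixed observation y
# and a latent variable x with three states (illustrative numbers only).
joint = np.array([0.10, 0.25, 0.05])

evidence = joint.sum()        # p(y | theta)
posterior = joint / evidence  # p(x | y, theta)

# An arbitrary trial distribution q(x), drawn at random for comparison.
rng = np.random.default_rng(0)
q_random = rng.dirichlet(np.ones(3))

f_post = free_energy(posterior, joint)
f_rand = free_energy(q_random, joint)

print("F at posterior  :", f_post)
print("log p(y | theta):", np.log(evidence))
print("F at random q   :", f_rand)
```

In this discrete setting the free energy at the posterior equals $\log p(y|\theta)$, and any other normalized $q(x)$ gives a value that is no larger, which is the behavior the E step relies on.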
When maximizing $F[q(x), \theta]$ with respect to $q(x)$, the constraint $\int_{-\infty}^{\infty} q(x)\, dx = 1$ must be imposed because $q(x)$ is a probability distribution. Therefore, this maximization problem can be formulated as
\[
\hat{q}(x) = \operatorname*{argmax}_{q(x)} F[q(x), \theta], \quad \text{subject to} \quad \int_{-\infty}^{\infty} q(x)\, dx = 1. \tag{B.52}
\]
Such a constrained optimization problem can be solved by the method of Lagrange multipliers. Denoting the Lagrange multiplier by $\gamma$, the Lagrangian is defined as
\[
\begin{aligned}
\mathcal{L}[q, \gamma] &= F[q, \theta] + \gamma \left( \int_{-\infty}^{\infty} q(x)\, dx - 1 \right) \\
&= \int_{-\infty}^{\infty} dx\, q(x) \left[ \log p(x, y|\theta) - \log q(x) \right] + \gamma \left( \int_{-\infty}^{\infty} q(x)\, dx - 1 \right).
\end{aligned} \tag{B.53}
\]
The constrained optimization problem in Eq. (B.52) is now rewritten as the unconstrained optimization problem in Eq. (B.53). The probability distribution $q(x)$ that maximizes the Lagrangian $\mathcal{L}[q, \gamma]$ is the solution of the constrained optimization problem in Eq. (B.52).
Differentiating $\mathcal{L}[q, \gamma]$ with respect to $q(x)$ and setting the derivative to zero, we have
\[
\frac{\delta \mathcal{L}[q(x), \gamma]}{\delta q(x)} = \log p(x, y|\theta) - \log q(x) - 1 + \gamma = 0. \tag{B.54}
\]
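The derivative of the Lagrangian with respect to $q(x)$ can be checked numerically in a discrete analogue, where $q$ is a vector and the functional derivative becomes an ordinary partial derivative with respect to each component. The sketch below is only an illustration under stated assumptions: the joint values, the trial vector `q`, the value of `gamma`, and the function names are all hypothetical. It compares a central finite-difference gradient of the discrete Lagrangian with the closed-form expression $\log p - \log q - 1 + \gamma$.

```python
import numpy as np

def lagrangian(q, gamma, joint):
    """Discrete analogue of the Lagrangian:
    sum_x q(x) [log p(x, y | theta) - log q(x)] + gamma * (sum_x q(x) - 1)."""
    return np.sum(q * (np.log(joint) - np.log(q))) + gamma * (np.sum(q) - 1.0)

def analytic_gradient(q, gamma, joint):
    """Closed-form partial derivatives: log p - log q - 1 + gamma."""
    return np.log(joint) - np.log(q) - 1.0 + gamma

# Hypothetical values, chosen only for illustration.
joint = np.array([0.10, 0.25, 0.05])  # p(x, y | theta) for a fixed y
q = np.array([0.5, 0.3, 0.2])         # trial distribution q(x)
gamma = 0.7                           # arbitrary Lagrange multiplier value

# Central finite differences, perturbing one component of q at a time.
eps = 1e-6
numeric = np.zeros_like(q)
for i in range(q.size):
    step = np.zeros_like(q)
    step[i] = eps
    numeric[i] = (lagrangian(q + step, gamma, joint)
                  - lagrangian(q - step, gamma, joint)) / (2.0 * eps)

max_abs_error = np.max(np.abs(numeric - analytic_gradient(q, gamma, joint)))
print("finite-difference gradient:", numeric)
print("analytic gradient         :", analytic_gradient(q, gamma, joint))
print("max abs difference        :", max_abs_error)
```

Each component of `q` is perturbed independently here because the Lagrangian is defined for unnormalized $q$ as well; the normalization is enforced only through the multiplier term.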
A brief explanation of the differentiation of a functional, as well as the derivation of Eq. (B.54), is presented in Sect. C.5 in the Appendix. Differentiating $\mathcal{L}[q, \gamma]$ with respect to $\gamma$ gives