are constrained to have equal value and essentially no learning (or pruning) occurs. Consequently, the standard weighted minimum 2-norm solution can be seen as a special case.
We will often concern ourselves with flat hyperpriors when considering the γ-MAP option. In this case, the third term in Eq. (6.13) vanishes and the only regularization comes from the log|Σ_y| term. In this context, the optimization problem with respect to the unknown hyperparameters is sometimes referred to as type-II maximum likelihood. It is also equivalent to the restricted maximum likelihood (ReML)
cost function discussed by Friston et al. [7]. Regardless of how γ is optimized, once it is obtained we compute Σ_s, which fully specifies our assumed empirical prior on s. To the extent that the “learned” prior p(s|γ) is realistic, this posterior quantifies regions of significant current density, and point estimates for the unknown sources can be obtained by evaluating the posterior mean.
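For the Gaussian model, this posterior mean has a simple closed form once the learned hyperparameters are in hand. The following is a minimal numerical sketch, assuming a toy lead field L, a diagonal prior covariance Σ_s built from one γ per source, and a noise variance λ; all names and sizes are hypothetical, not taken from the chapter:

```python
import numpy as np

rng = np.random.default_rng(0)
d_y, d_s = 8, 20                          # toy sensor and source counts
L = rng.standard_normal((d_y, d_s))       # hypothetical lead-field matrix
lam = 0.1                                 # assumed noise variance

gamma = rng.uniform(0.5, 1.5, d_s)        # "learned" hyperparameters (toy values)
gamma[0] = 0.0                            # a pruned source: zero prior variance
Sigma_s = np.diag(gamma)                  # empirical prior covariance on s

# Posterior mean: s_hat = Sigma_s L^T (lam I + L Sigma_s L^T)^{-1} y
Sigma_y = lam * np.eye(d_y) + L @ Sigma_s @ L.T
y = rng.standard_normal(d_y)
s_hat = Sigma_s @ L.T @ np.linalg.solve(Sigma_y, y)
print(s_hat[0])  # prints 0.0: a pruned source is estimated as exactly zero
```

Note how a hyperparameter driven to zero removes its source from the estimate entirely, which is the pruning behavior described above.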
6.3.2.2 Optimization of γ-MAP
The primary objective of this section is to minimize Eq. (6.13) with respect to γ. For simplicity, we will first present updates with f_i(γ_i) = 0 (i.e., a flat hyperprior). We then address natural adaptations to more general cases (e.g., not just conjugate priors). Of course, one option is to treat the problem as a general nonlinear optimization task and perform gradient descent or some other generic procedure. In contrast, here we will focus on methods specifically tailored for minimizing Eq. (6.13) using
principled methodology. We begin with methods based directly on the EM algorithm
and then diverge to alternatives that draw on convex analysis to achieve faster convergence. One approach to minimizing Eq. (6.13) is the restricted maximum likelihood (ReML) method, which utilizes what amounts to EM-based updates treating s as hidden data. For the E-step, the mean and covariance of s are computed given some estimate of the hyperparameters γ. For the M-step, we then update γ treating these moments as the true values. Unfortunately, the optimal value of γ cannot be obtained in closed form for arbitrary covariance component sets, so a second-order Fisher scoring procedure is adopted to approximate the desired solution. While effective for estimating small numbers of hyperparameters, this approach requires inverting a d_γ × d_γ Fisher information matrix, which is not computationally feasible for large d_γ. Moreover, unlike exact EM implementations, there is no guarantee that such a Fisher scoring method will decrease the cost function Eq. (6.13) at each iteration.
Consequently, here we present alternative optimization procedures that apply to the arbitrary covariance model discussed above and naturally guarantee that γ_i ≥ 0 for all i. All of these methods rely on reparameterizing the generative model such that the implicit M-step can be solved in closed form. First, we note that the cost function L(γ) only depends on the data y through the d_y × d_y sample correlation matrix C_y. Therefore,
to reduce the computational burden, we replace y with a matrix ỹ ∈ ℝ^(d_y × rank(y)) such that yy^T = ỹỹ^T. This removes any per-iteration dependency on n, which can potentially be large, without altering the actual cost function. It also implies that, for purposes of computing L(γ), the number of columns of s is reduced to match rank(y).
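This data reduction step can be sketched with an economy SVD: any reduced matrix that reproduces the outer product yy^T works, and one natural choice is the left singular vectors scaled by their singular values. Variable names and sizes below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
d_y, n = 8, 1000                      # few sensors, many time samples
Y = rng.standard_normal((d_y, n))

# Build Y_tilde of shape d_y x rank(Y) with Y_tilde @ Y_tilde.T == Y @ Y.T
U, svals, _ = np.linalg.svd(Y, full_matrices=False)
r = int((svals > 1e-10 * svals[0]).sum())   # numerical rank
Y_tilde = U[:, :r] * svals[:r]              # scale each left singular vector

err = np.abs(Y @ Y.T - Y_tilde @ Y_tilde.T).max()
print(Y_tilde.shape, err)  # an 8 x 8 matrix; err is at round-off level
```

Every subsequent iteration then works with the small d_y × rank(y) matrix instead of the full d_y × n data, which is where the per-iteration savings come from.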