are constrained to have equal value and essentially no learning (or pruning) occurs. Consequently, the standard weighted minimum ℓ2-norm solution can be seen as a special case.
We will often concern ourselves with flat hyperpriors when considering the γ-MAP option. In this case, the third term in Eq. (6.13) vanishes and the only regularization will come from the log|Σ_y| term. In this context, the optimization problem with respect to the unknown hyperparameters γ is sometimes referred to as type-II maximum likelihood. It is also equivalent to the restricted maximum likelihood (ReML) cost function, discussed by Friston et al. [7]. Regardless of how γ is optimized, once some estimate of γ is obtained, we compute Σ_s, which fully specifies our assumed empirical prior on s. To the extent that the “learned” prior p(s|γ) is realistic, this posterior quantifies regions of significant current density, and point estimates for the unknown sources can be obtained by evaluating the posterior mean.
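For illustration, the short sketch below evaluates such a posterior mean under a standard Gaussian model y = Ls + e with lead field L, noise covariance λI, and a diagonal learned prior covariance Γ = diag(γ). The symbols, shapes, and the simple diagonal form are assumptions of this example, not definitions taken from the chapter.

    import numpy as np

    def posterior_mean(y, L, gamma, lam):
        # Posterior mean E[s | y, gamma] for y = L s + e, e ~ N(0, lam*I),
        # under the empirical prior s ~ N(0, Gamma) with Gamma = diag(gamma).
        d_y = L.shape[0]
        Gamma = np.diag(gamma)                          # learned prior covariance
        Sigma_y = lam * np.eye(d_y) + L @ Gamma @ L.T   # model data covariance
        return Gamma @ L.T @ np.linalg.solve(Sigma_y, y)

    # Hypothetical sizes: 32 sensors, 200 candidate sources, 50 time samples
    rng = np.random.default_rng(0)
    L = rng.standard_normal((32, 200))
    y = rng.standard_normal((32, 50))
    s_hat = posterior_mean(y, L, gamma=rng.random(200), lam=1.0)  # 200 x 50 estimates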
6.3.2.2 Optimization of γ-MAP

The primary objective of this section is to minimize Eq. (6.13) with respect to γ. For simplicity, we will first present updates with f_i(γ_i) = 0 (i.e., a flat hyperprior). We then address natural adaptations to the more general cases (e.g., not just conjugate priors). Of course, one option is to treat the problem as a general nonlinear optimization task and perform gradient descent or some other generic procedure. In contrast, here we will focus on methods specifically tailored for minimizing Eq. (6.13) using principled methodology. We begin with methods based directly on the EM algorithm and then diverge to alternatives that draw on convex analysis to achieve faster convergence. One approach to minimizing Eq. (6.13) is the restricted maximum likelihood (ReML) method, which utilizes what amounts to EM-based updates treating s as hidden data.
For the E-step, the mean and covariance of s are computed given some estimate of the hyperparameters γ. For the M-step, we then must update γ using these moments as the true values. Unfortunately, the optimal value of γ cannot be obtained in closed form for arbitrary covariance component sets, so a second-order Fisher scoring procedure is adopted to approximate the desired solution. While effective for estimating small numbers of hyperparameters, this approach requires inverting a d_γ × d_γ Fisher information matrix, which is not computationally feasible for large d_γ. Moreover, unlike exact EM implementations, there is no guarantee that such a Fisher scoring method will decrease the cost function of Eq. (6.13) at each iteration.
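As a concrete, heavily simplified illustration of these two steps, the sketch below performs one EM iteration for the special case of one scalar hyperparameter per source, Γ = diag(γ), with fixed noise covariance λI; in this restricted setting the M-step does have a closed form, unlike the arbitrary covariance-component case discussed above. The variable names and model form are assumptions of this example.

    import numpy as np

    def em_update_gamma(y, L, gamma, lam):
        # One EM iteration for gamma, assuming Gamma = diag(gamma) and noise cov lam*I.
        d_y, n = y.shape
        Gamma = np.diag(gamma)
        Sigma_y = lam * np.eye(d_y) + L @ Gamma @ L.T

        # E-step: posterior mean and covariance of s given y and the current gamma
        K = Gamma @ L.T @ np.linalg.inv(Sigma_y)   # gain matrix
        s_mean = K @ y                             # d_s x n posterior means
        Sigma_s = Gamma - K @ L @ Gamma            # d_s x d_s posterior covariance

        # M-step: update each gamma_i from the posterior second moments of s_i
        return np.mean(s_mean**2, axis=1) + np.diag(Sigma_s)

Iterating such an update never increases the cost, but its convergence can be slow, which is one motivation for the convex-analysis-based alternatives mentioned above.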
Consequently, here we present alternative optimization procedures that apply to the arbitrary covariance model discussed above, and naturally guarantee that γ_i ≥ 0 for all i. All of these methods rely on reparameterizing the generative model such that the implicit M-step can be solved in closed form. First, we note that L(γ) only depends on the data y through the d_y × d_y sample correlation matrix C_y. Therefore, to reduce the computational burden, we replace y with a matrix ỹ ∈ R^(d_y × rank(y)) such that yy^T = ỹỹ^T. This removes any per-iteration dependency on n, which can potentially be large, without altering the actual cost function. It also implies that, for purposes of computing γ, the number of columns of s is reduced to match rank(y).
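One way to construct such a replacement, sketched below under the same illustrative assumptions as before, is a thin singular value decomposition of the data: keeping the left singular vectors scaled by the nonzero singular values preserves yy^T exactly while reducing the number of columns to rank(y).

    import numpy as np

    def reduce_data(y, tol=1e-12):
        # Return y_tilde with rank(y) columns such that y_tilde y_tilde^T = y y^T.
        U, svals, _ = np.linalg.svd(y, full_matrices=False)  # thin SVD, y = U S V^T
        keep = svals > tol * svals.max()                      # drop numerically zero modes
        return U[:, keep] * svals[keep]                       # y_tilde = U_r S_r

    # Example: 32 sensors by 10,000 samples collapses to at most 32 columns
    rng = np.random.default_rng(0)
    y = rng.standard_normal((32, 10_000))
    y_tilde = reduce_data(y)
    assert np.allclose(y @ y.T, y_tilde @ y_tilde.T)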