6.3.2.3 Analysis of γ-MAP
Previously, we have claimed that the γ-MAP process naturally forces excessive/irrelevant hyperparameters to converge to zero, thereby reducing model complexity. Note that, somewhat counterintuitively, this occurs even when a flat hyperprior is assumed. While this observation has been verified empirically by ourselves and others in various application settings, there has been relatively little corroborating theoretical evidence, largely because of the difficulty in analyzing the potentially multimodal, non-convex γ-MAP cost function. We can then show that: every local minimum of the generalized γ-MAP cost function is achieved at a solution with at most rank(y) ≤ d_y non-zero hyperparameters if f_i(γ_i) is concave and non-decreasing for all i, including flat hyperpriors. Therefore, we can be confident that the pruning mechanism of γ-MAP is not merely an empirical phenomenon. Nor is it dependent on a particular sparse hyperprior; the result holds even when a flat (uniform) hyperprior is assumed.
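To make this pruning concrete, the following is a minimal sketch of the familiar EM-style γ-MAP update for the simplest version of the model, assuming observations y = Ls + noise with isotropic noise variance lam, a diagonal source covariance Γ = diag(γ) with one hyperparameter per candidate component, and a flat hyperprior; the lead field L, the dimensions, and the synthetic data are illustrative stand-ins rather than anything from this chapter. Running it drives most entries of γ to zero, which is the pruning behavior discussed above.

```python
import numpy as np

def gamma_map(Y, L, lam=1e-2, n_iter=200, prune_tol=1e-8):
    """EM-style gamma-MAP (sparse Bayesian learning) sketch.

    Y   : (d_y, n) observation matrix (n observation vectors / time points)
    L   : (d_y, d_s) lead-field / design matrix
    lam : assumed noise variance (flat hyperprior on gamma)
    """
    d_y, n = Y.shape
    d_s = L.shape[1]
    gamma = np.ones(d_s)                     # one hyperparameter per component

    for _ in range(n_iter):
        Gamma = np.diag(gamma)
        Sigma_y = lam * np.eye(d_y) + L @ Gamma @ L.T      # model data covariance
        Sigma_y_inv = np.linalg.inv(Sigma_y)

        # Posterior mean and covariance of the sources given the current gamma
        Mu = Gamma @ L.T @ Sigma_y_inv @ Y                  # (d_s, n)
        Sigma_s = Gamma - Gamma @ L.T @ Sigma_y_inv @ L @ Gamma

        # EM update: per-component second moment, averaged over the n columns
        gamma = np.mean(Mu**2, axis=1) + np.diag(Sigma_s)

        # Irrelevant components collapse toward zero and can be pruned
        gamma[gamma < prune_tol] = 0.0

    return gamma, Mu

# Tiny synthetic example: only 3 of 50 candidate components are active
rng = np.random.default_rng(0)
L = rng.standard_normal((20, 50))
S_true = np.zeros((50, 10))
S_true[[4, 17, 33], :] = rng.standard_normal((3, 10))
Y = L @ S_true + 0.05 * rng.standard_normal((20, 10))

gamma, S_hat = gamma_map(Y, L)
print("non-zero hyperparameters:", np.flatnonzero(gamma > 1e-6))
```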
The number of observation vectors n also plays an important role in shaping γ-MAP solutions. Increasing n has two primary benefits: (i) it facilitates convergence to the global minimum (as opposed to getting stuck in a suboptimal extremum) and (ii) it improves the quality of this minimum by mitigating the effects of noise. Finally, a third benefit of using n > 1 is that it leads to temporal smoothing of the estimated time courses (i.e., the rows of ŝ). This occurs because the selected covariance components do not change across time, as would be the case if a separate set of hyperparameters were estimated at each time point. For purposes of model selection, a rigorous bound on log p(y) can be derived using principles from convex analysis that have been successfully applied in general-purpose probabilistic graphical models (see Wipf and Nagarajan [13]).
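The temporal-smoothing effect can be read off the posterior mean. Assuming the simplest Gaussian form of the model used above (lead field L, learned covariance Γ̂ = diag(γ̂), isotropic noise variance λ; a sketch whose notation may differ slightly from the chapter's), the estimate applies one fixed spatial filter to every observation vector:

$$\hat{s} \;=\; \hat{\Gamma} L^{\top}\bigl(\lambda I + L\hat{\Gamma}L^{\top}\bigr)^{-1} y .$$

Each column (time point) of y passes through the same filter, so any row of ŝ whose hyperparameter has been pruned to zero is zero at every time point, and the surviving rows share a single, temporally stable support rather than one re-estimated at each time point.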
6.3.3 Source MAP or Penalized Likelihood Methods
The second option is to integrate out the unknown γ; we can then treat p(s) as the effective prior and attempt to compute a MAP estimate of s via

$$\hat{s} \;=\; \arg\max_{s}\; p(y \mid s)\int p(s \mid \gamma)\, p(\gamma)\, d\gamma \;=\; \arg\max_{s}\; p(y \mid s)\, p(s) \qquad (6.22)$$
While it may not be immediately transparent, solving s-MAP also leads to a shrinking and pruning of superfluous covariance components. In short, this occurs because the hierarchical model upon which it is based leads to a convenient, iterative EM algorithm-based implementation, which treats the hyperparameters γ as hidden data and computes their expectation for the E-step. Over the course of learning, this expectation collapses to zero for many of the irrelevant hyperparameters, removing them from the model in much the same way as γ-MAP.
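As an illustration of these EM mechanics (a sketch under simplifying assumptions, not the algorithm as developed in this chapter): with independent priors p(s_i | γ_i) = N(0, γ_i) and an improper Jeffreys-type hyperprior, the E-step expectation works out to E[1/γ_i | s_i] = 1/s_i², so the effective hyperparameter update is γ_i ← s_i² and the M-step becomes a reweighted ridge regression, a FOCUSS-style iteration. Coefficients attached to irrelevant components, and hence their expected hyperparameters, collapse to zero over the iterations.

```python
import numpy as np

def s_map_em(y, L, lam=1e-2, n_iter=100, eps=1e-12):
    """EM sketch for s-MAP with p(s_i | gamma_i) = N(0, gamma_i) and a
    Jeffreys-type hyperprior, for which the E-step reduces to the
    FOCUSS-style effective update gamma_i <- s_i**2 (an illustrative
    special case, not the general algorithm).

    y : (d_y,) single observation vector
    L : (d_y, d_s) lead-field / design matrix
    """
    d_y, d_s = L.shape
    s = L.T @ np.linalg.solve(L @ L.T, y)   # minimum-norm initialization
    for _ in range(n_iter):
        # E-step (effective): expected hyperparameters given the current s
        gamma = s**2
        # M-step: weighted ridge regression with the current weights gamma
        G = L * gamma                        # = L @ diag(gamma), shape (d_y, d_s)
        s = G.T @ np.linalg.solve(lam * np.eye(d_y) + G @ L.T, y)
        s[np.abs(s) < eps] = 0.0             # pruned sources stay at zero
    return s

# Example: sparse recovery of 3 active sources out of 50
rng = np.random.default_rng(1)
L = rng.standard_normal((20, 50))
s_true = np.zeros(50)
s_true[[4, 17, 33]] = rng.standard_normal(3)
y = L @ s_true + 0.01 * rng.standard_normal(20)

s_hat = s_map_em(y, L)
print("recovered support:", np.flatnonzero(np.abs(s_hat) > 1e-6))
```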
 