maximizing the marginal likelihood, we describe an algorithm that maximizes a
quantity called the average data likelihood to obtain estimates for the hyperparame-
ters. This algorithm is called the expectation maximization (EM) algorithm, and is
described in the following.
B.5.2 Average Data Likelihood
The EM algorithm computes a quantity called the average data likelihood. Computing the average data likelihood is much easier than computing the marginal likelihood. To define the average data likelihood, let us first define the complete data likelihood, such that
$$\log p(y, x \mid \Phi, \Lambda) = \log p(y \mid x, \Lambda) + \log p(x \mid \Phi). \tag{B.33}$$
If we observed not only $y$ but also $x$, we could estimate $\Phi$ and $\Lambda$ by maximizing $\log p(y, x \mid \Phi, \Lambda)$ with respect to these hyperparameters. However, since we do not observe $x$, we must substitute some "reasonable" value for the unknown $x$ in $\log p(y, x \mid \Phi, \Lambda)$.
Having observed $y$, we actually know which values of $x$ are reasonable: our best knowledge of the unknown $x$ is represented by the posterior distribution $p(x \mid y)$. Thus, the "reasonable" value would be the one that maximizes the posterior probability, and one solution would be to use the MAP estimate of $x$ in $\log p(y, x \mid \Phi, \Lambda)$. A better solution is to use all possible values of $x$ in the complete data likelihood and average over them with the posterior probability. This results in the average data likelihood, $\Theta(\Phi, \Lambda)$:
$$
\begin{aligned}
\Theta(\Phi, \Lambda) &= \int p(x \mid y) \log p(y, x \mid \Phi, \Lambda)\, dx = E\!\left[\log p(y, x \mid \Phi, \Lambda)\right] \\
&= E\!\left[\log p(y \mid x, \Lambda)\right] + E\!\left[\log p(x \mid \Phi)\right],
\end{aligned} \tag{B.34}
$$
where the expectation $E[\cdot]$ is taken with respect to the posterior probability $p(x \mid y)$.
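As a concrete check of this definition, the sketch below evaluates the average data likelihood for a hypothetical scalar instance of a Gaussian model, $x \sim N(0, \Phi)$ and $y \mid x \sim N(x, \Lambda)$ (the scalar setup and all numeric values here are illustrative assumptions, not the specific model of Sect. B.3), both by Monte Carlo averaging over the posterior and in closed form:

```python
import numpy as np

def log_gauss(z, mean, var):
    """Log density of N(mean, var) evaluated at z."""
    return -0.5 * np.log(2 * np.pi * var) - (z - mean) ** 2 / (2 * var)

# Hypothetical scalar instance: x ~ N(0, phi), y | x ~ N(x, lam)
phi, lam, y = 4.0, 1.0, 2.5

# Exact Gaussian posterior p(x | y): variance v and mean m
v = 1.0 / (1.0 / phi + 1.0 / lam)
m = v * y / lam

# Monte Carlo version of the average data likelihood: average the
# complete data log likelihood log p(y, x) over posterior samples of x
rng = np.random.default_rng(0)
xs = rng.normal(m, np.sqrt(v), 500_000)
theta_mc = np.mean(log_gauss(y, xs, lam) + log_gauss(xs, 0.0, phi))

# Closed form, using E[(y - x)^2] = (y - m)^2 + v and E[x^2] = m^2 + v
theta = (log_gauss(y, m, lam) - v / (2 * lam)
         + log_gauss(m, 0.0, phi) - v / (2 * phi))

print(theta_mc, theta)  # the two values agree to Monte Carlo accuracy
```

The closed form exists here because both expectations reduce to second moments of the Gaussian posterior; this is what makes the average data likelihood "much easier" to handle than the marginal likelihood.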
The estimates of the hyperparameters, $\Phi$ and $\Lambda$, are obtained using

$$\Lambda = \operatorname*{argmax}_{\Lambda}\, \Theta(\Phi, \Lambda), \tag{B.35}$$

$$\Phi = \operatorname*{argmax}_{\Phi}\, \Theta(\Phi, \Lambda). \tag{B.36}$$
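To make these alternating maximizations concrete, here is a minimal EM sketch for an assumed random-effects Gaussian model, in which each latent $x_i \sim N(0, \Phi)$ is observed $K$ times with noise variance $\Lambda$ (this model and the variable names are illustrative assumptions, not the model of Sect. B.3). Both M-step updates have closed forms because each expectation in the average data likelihood is a Gaussian posterior moment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from the assumed model: x_i ~ N(0, phi_true), and each
# x_i is observed K times with additive noise of variance lam_true
phi_true, lam_true = 4.0, 1.0
N, K = 2000, 5
x = rng.normal(0.0, np.sqrt(phi_true), N)
y = x[:, None] + rng.normal(0.0, np.sqrt(lam_true), (N, K))

phi, lam = 1.0, 1.0  # initial guesses for the hyperparameters
for _ in range(100):
    # E-step: posterior p(x_i | y_i) is Gaussian with variance v, mean m_i
    v = 1.0 / (1.0 / phi + K / lam)
    m = v * y.sum(axis=1) / lam
    # M-step: maximize Theta(phi, lam); each expectation term has a
    # closed-form maximizer built from the posterior moments
    phi = np.mean(m**2) + v                 # from E[log p(x | phi)]
    lam = np.mean((y - m[:, None])**2) + v  # from E[log p(y | x, lam)]

print(phi, lam)  # should approach phi_true = 4 and lam_true = 1
```

In richer models $\Phi$ and $\Lambda$ are matrices and the updates involve full posterior covariances, but the structure is the same: the E-step computes posterior moments of $x$, and the M-step maximizes $\Theta(\Phi, \Lambda)$ in closed form.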
In the Gaussian model discussed in Sect. B.3, $p(x \mid \Phi)$ and $p(y \mid x, \Lambda)$ are expressed in Eqs. (B.17) and (B.15), respectively. Substituting Eqs. (B.17) and (B.15) into (B.33), the complete data likelihood is expressed as²

² The constant terms containing $2\pi$ are ignored here.