Assuming that we have a likelihood function that can be computed directly, we would like to be able to write equation 2.23 in terms of these likelihood functions. The following algebraic steps take us there. First, we note that the definition of the likelihood (equation 2.25) gives us a new way of expressing the joint probability term that appears in equation 2.23:

    P(h,d) = P(d|h) P(h)    (2.26)

which can be substituted back into equation 2.23, giving:

    P(h|d) = P(d|h) P(h) / P(d)    (2.27)

This last equation is known as Bayes formula, and it provides the starting point for a whole field known as Bayesian statistics. It allows you to write P(h|d), which is called the posterior in Bayesian terminology, in terms of the likelihood times the prior, which is what P(h) is called. The prior basically indicates how likely the hypothesis is to be true without having seen any data at all: some hypotheses are just more plausible (true more often) than others, and this can be reflected in this term. Priors are often used to favor simpler hypotheses as more likely, but this is not necessary. In our application here, the prior terms will end up being constants, which can actually be measured (at least approximately) from the underlying biology.

As in equation 2.23, the likelihood times the prior is normalized by the probability of the data P(d) in Bayes formula. We can replace P(d) with an expression involving only likelihood and prior terms if we make use of our null hypothesis h̄. Again, we want to use likelihood terms because they can often be computed directly. Because our hypothesis and null hypothesis are mutually exclusive and sum to 1, we can write the probability of the data in terms of the part of it that overlaps with the hypothesis plus the part that overlaps with the null hypothesis:

    P(d) = P(h,d) + P(h̄,d)    (2.28)

In figure 2.21, this amounts to computing P(d) in the top and bottom halves separately, and then adding these results to get the overall result. As we did before to get Bayes formula, these joint probabilities can be turned into conditional probabilities with some simple algebra on the conditional probability definition (equation 2.23), giving us the following:

    P(h̄,d) = P(d|h̄) P(h̄)    (2.29)

    P(d) = P(d|h) P(h) + P(d|h̄) P(h̄)    (2.30)

which can then be substituted into Bayes formula, resulting in:

    P(h|d) = P(d|h) P(h) / [P(d|h) P(h) + P(d|h̄) P(h̄)]    (2.31)

This is now an expression that is strictly in terms of just the likelihoods and priors for the two hypotheses! Indeed, this is the equation that we showed at the outset (equation 2.22), with f(h,d) = P(d|h) P(h) and f(h̄,d) = P(d|h̄) P(h̄). It has a very simple h / (h + h̄) form, which reflects a balancing of the likelihood in favor of the hypothesis with that against it. It is this form that the biological properties of the neuron implement, as we will see more explicitly in a subsequent section.
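To make the computation concrete, here is a minimal sketch of equation 2.31 in Python; the function name posterior and its argument names are our own illustrative choices, not anything defined in the text.

    def posterior(lik_h, prior_h, lik_null, prior_null):
        """Posterior P(h|d) from equation 2.31, using only likelihoods and priors.

        lik_h      -- P(d|h),     likelihood of the data under the hypothesis
        prior_h    -- P(h),       prior probability of the hypothesis
        lik_null   -- P(d|h-bar), likelihood of the data under the null hypothesis
        prior_null -- P(h-bar),   prior probability of the null hypothesis
        """
        # Support for the hypothesis: f(h,d) = P(d|h) P(h)
        support_h = lik_h * prior_h
        # Support for the null hypothesis: f(h-bar,d) = P(d|h-bar) P(h-bar)
        support_null = lik_null * prior_null
        # The denominator is P(d) written in likelihood-and-prior terms (equation 2.30)
        return support_h / (support_h + support_null)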
Before continuing, let's verify that equation 2.31 produces the same result as equation 2.23 for the case we have been considering all along (P(h = 1 | d = 110)). First, we know that the likelihood P(d = 110 | h = 1) according to the table is (2/24) / (12/24) or .167. Also, P(h̄) = .5, and P(h) = .5 as well. The only other thing we need is P(d|h̄), which we can see from the table is (1/24) / (12/24) or .083. The result is thus:

    P(h = 1 | d = 110) = (.167)(.5) / [(.167)(.5) + (.083)(.5)] = .67    (2.32)

So, we can see that this result agrees with the previously computed value. Obviously, if you have the table, this seems like a rather indirect way of computing things, but we will see in the next section how the likelihood terms can be computed without a table.
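As a quick cross-check of equation 2.32, the same numbers can be plugged into the posterior sketch shown above (again an illustrative helper, not something defined in the text):

    # Likelihoods read off the world state table, as in the text
    lik_h = (2/24) / (12/24)     # P(d=110 | h=1) = .167
    lik_null = (1/24) / (12/24)  # P(d=110 | h-bar) = .083
    prior_h = prior_null = 0.5   # both hypotheses equally likely a priori

    print(posterior(lik_h, prior_h, lik_null, prior_null))  # ~0.67, matching the value from equation 2.23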
2.7.2 Subjective Probabilities
Everything we just did was quite straightforward because we had a world state table, and could therefore compute objective probabilities. However, when more