than just a few inputs are present, a table like that in figure 2.21 becomes intractably large due to the huge number of different unique combinations of input states. For example, if the inputs are binary (which is not actually true for neurons, so it's even worse), the table requires 2^(n+1) entries for n inputs, with the extra factor of two (accounting for the +1 in the exponent) reflecting the fact that all possibilities must be considered twice, once under each hypothesis. This is roughly 1.1x10^301 for just 1,000 inputs (and our calculator gives Inf as a result if we plug in a conservative guess of 5,000 inputs for a cortical neuron). This is the main reason why we need to develop subjective ways of computing probabilities.
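To get a concrete feel for these numbers, here is a minimal sketch (in Python; ours, not part of the original text) of the table-size arithmetic just described:

```python
# Size of the objective probability table for n binary input sources:
# 2**n distinct input patterns, each counted once under each hypothesis
# (hypothesis true / hypothesis false), giving 2**(n+1) entries in total.

def table_entries(n: int) -> int:
    """Number of table entries needed for n binary input sources."""
    return 2 ** (n + 1)

for n in (3, 1000, 5000):
    entries = table_entries(n)
    # Report the order of magnitude rather than printing the full integer.
    print(f"n = {n:5d}: about 10**{len(str(entries)) - 1} entries")

# n =     3: about 10**1 entries    (16 -- easy to tabulate by hand)
# n =  1000: about 10**301 entries
# n =  5000: about 10**1505 entries (far beyond any float, hence "Inf")
```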
As we have stated, the main way we avoid using a table of objective probabilities is to use likelihood terms that can be computed directly as a function of the input data and the specification of the hypothesis, without reference to objective probabilities and the requisite table. When we directly compute a likelihood function, we effectively make a set of assumptions about the nature of the hypothesis and its relationship with the data, and then compute the likelihood under these assumptions. In general, we have no way of validating these assumptions (which would require the intractable table), so we must instead evaluate the plausibility of the assumptions and their relationship to the hypotheses.

One plausible assumption about the likelihood function for a detector is that it is directly (linearly) proportional to the number of inputs that match what the detector is trying to detect. Thus, we use a set of parameters to specify to what extent each input source is representative of the hypothesis that something interesting is "out there." These parameters are just our standard weight parameters w. Together with the linear proportionality assumption, this gives a likelihood function that is a normalized linear function of the weighted inputs:

P(d|h) = \frac{1}{z} \sum_i w_i d_i \qquad (2.33)

where d_i is the value of one input source i (e.g., d_i = 1 if that source detected something, and 0 otherwise), and the normalizing term z ensures that the result is a valid probability between 0 and 1. We will see in a moment that we need not be too concerned with the value of z.
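Equation 2.33 is simple enough to state directly in code. The following is a minimal Python sketch of such a detector likelihood (ours, not from the text); in particular, normalizing by the sum of the weights when no z is supplied is just one convenient assumption that keeps the result between 0 and 1:

```python
from typing import Optional, Sequence

def likelihood(d: Sequence[float], w: Sequence[float],
               z: Optional[float] = None) -> float:
    """P(d|h) from equation 2.33: (1/z) * sum_i(w_i * d_i).

    If z is not given, the sum of the weights is used, which keeps the
    result in [0, 1] for inputs in [0, 1] (an illustrative choice; the
    text notes that the exact value of z turns out not to matter much).
    """
    if z is None:
        z = sum(w) or 1.0   # guard against all-zero weights
    return sum(wi * di for wi, di in zip(w, d)) / z

# A detector with three input sources, all weighted 1:
w = [1.0, 1.0, 1.0]
print(likelihood([1, 1, 1], w))   # 1.0   -- every relevant source is active
print(likelihood([0, 1, 0], w))   # ~0.33 -- one of three sources is active
print(likelihood([0, 0, 0], w))   # 0.0   -- nothing detected
```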
First, we want to emphasize what has been done here. Equation 2.33 means that input patterns d become more probable when there is activity on input sources that are thought to reflect the presence of something of interest in the world, as parameterized by the weight value w_i. Thus, if w_i = 1, we care about that input, but if it is 0, we don't care (because it is not relevant to our hypothesis). Furthermore, the overall likelihood is just the (normalized) sum of all these individual source-level contributions; our detector does not represent interactions among the inputs. The beauty of the Bayesian framework is that it enables us to use these definitions (or any others that we might also find plausible) to then compute, in a rational manner, the extent to which we should believe a given hypothesis to be true in the context of a particular data input. Of course, garbage-in gives garbage-out, so the whole thing rests on how good (plausible) the likelihood definition is.

In effect, what we have done with equation 2.33 is to provide a definition of exactly what the hypothesis h is, by explicitly stating how likely any given input pattern would be assuming this hypothesis were true. The fact that we are defining probabilities, not measuring them, makes these probabilities subjective. They no longer correspond to frequencies of objectively measurable events in the world. Nevertheless, by working out our equations in the previous section as if we had objective probabilities, and establishing a self-consistent mathematical framework via Bayes' formula, we are assured of using our subjective probabilities in the most "rational" way possible.

The objective world defined by the state table in figure 2.21 corresponds to the definition of the likelihood given by equation 2.33, because the frequency (objective probability) of each input state when the hypothesis is true is proportional to the number of active inputs in that state; this is exactly the assumption we made in constructing equation 2.33. As you can verify yourself, the equation for the likelihood in this objective world is

P(d|h) = \frac{1}{z} \sum_{i=1}^{3} d_i \qquad (2.34)

where we assume that the weights for the 3 input sources are all 1 (figure 2.23).
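Since you are invited to verify this correspondence yourself, here is a small Python sketch of that check. The construction of the "objective world" table below is our own assumption for illustration (figure 2.21 itself is not reproduced here): each pattern's frequency under the hypothesis is simply taken to be proportional to its number of active inputs, and z is chosen to make those frequencies sum to 1.

```python
from itertools import product

# All 2**3 binary input patterns for a 3-input detector.
patterns = list(product([0, 1], repeat=3))

# Assumed objective world: when the hypothesis is true, each pattern
# occurs with a frequency proportional to its number of active inputs.
freq_given_h = {d: sum(d) for d in patterns}
z = sum(freq_given_h.values())   # normalizer, so the frequencies sum to 1

# Equation 2.34: likelihood with all three weights equal to 1.
for d in patterns:
    print(d, freq_given_h[d] / z)

# (0, 0, 0) gets probability 0, each single-input pattern gets 1/12,
# two-input patterns get 2/12, and (1, 1, 1) gets 3/12 -- proportional
# to the number of active inputs, exactly as equation 2.33 assumes.
```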
To illustrate the impor-