Higher-Level Cognition - Computational Explorations in Cognitive Neuroscience

the value for that unit. Values of exactly 0 indicate no reward information at all, not the absence of reward.)

Press act in the network window to view the plus phase activations, where you can see the active PFC units. Press StepSettle several times, observing the AC unit activation in the minus and plus phases. Stop when you see the other dimension getting activated in the input; the network is moving on to the second training block at this point.

You should notice for the next several trials that the network continues to perform correctly, so that the AC unit is activated in the plus phase. It is not yet activated in the minus phase, because minus-phase activation depends on learning to predict or expect the reward. This learning is taking place, and early in the next block the AC unit will come to expect this reward in the minus phase.
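
The way the minus-phase expectation catches up with the plus-phase reward can be illustrated with a minimal sketch. It assumes a single AC-like unit and a simple delta-rule update; the function name, the learning rate, and the reward value of 1.0 are illustrative assumptions, not the simulator's actual parameters or learning rule.

```python
def expected_reward_trace(reward=1.0, lrate=0.3, n_trials=8):
    """Return a list of (minus, plus, td) tuples, one per trial.

    Hypothetical sketch: the minus phase holds the unit's current
    expectation, the plus phase holds the delivered reward.
    """
    expectation = 0.0  # nothing is expected before learning
    trace = []
    for _ in range(n_trials):
        minus = expectation      # minus phase: predicted reward
        plus = reward            # plus phase: reward actually received
        td = plus - minus        # difference across the two phases
        trace.append((minus, plus, td))
        expectation += lrate * td  # assumed delta-rule update
    return trace


for trial, (minus, plus, td) in enumerate(expected_reward_trace()):
    print(f"trial {trial}: minus={minus:.2f} plus={plus:.2f} td={td:.2f}")
```

In this sketch the plus-phase value is present from the first rewarded trial, while the minus-phase value only grows as the expectation is learned, so the difference between the two phases shrinks over trials.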

At this point, the network is responding correctly to the location of the first feature, and so has completed the first training block. However, you may have noticed that there is not necessarily a clear correspondence in the feature-level PFC to this target stimulus for the initial training block. This is because the second stimulus is actually perfectly anticorrelated with the first one in location, so the network could just as easily learn “press in the opposite direction of the second stimulus.” Monkeys (and people) probably have biases to learn the more direct “press in the same direction as the first stimulus,” but this bias is not captured in the model. Nevertheless, this bias turns out not to be critical, as we shall see.
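
The equivalence of the two rules under perfect anticorrelation can be checked with a small sketch. It assumes exactly two stimulus locations, labeled LEFT and RIGHT here; the labels and function names are hypothetical and are not taken from the simulator.

```python
LEFT, RIGHT = 0, 1  # assumed: two possible locations


def opposite(loc):
    """Return the other location."""
    return RIGHT if loc == LEFT else LEFT


def rule_same_as_first(first_loc, second_loc):
    """Press in the same direction as the first stimulus."""
    return first_loc


def rule_opposite_of_second(first_loc, second_loc):
    """Press in the opposite direction of the second stimulus."""
    return opposite(second_loc)


# Perfect anticorrelation: the second stimulus is always at the
# location the first stimulus does not occupy.
trials = [(first, opposite(first)) for first in (LEFT, RIGHT)]

responses_a = [rule_same_as_first(f, s) for f, s in trials]
responses_b = [rule_opposite_of_second(f, s) for f, s in trials]
print(responses_a == responses_b)  # True
```

Because both rules yield the same response on every trial of this block, the error signal alone cannot favor one over the other.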

To monitor the performance of the network on this task, it is useful to check the Epoch_0_GraphLog, which contains a plot of the error over epochs. You can see that the network error has descended to zero during this first training block, and completed one additional epoch with zero error, indicating that it is time to move on to the next block.
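
The block-advance condition described here can be sketched as a simple check on the per-epoch error record. The sketch assumes the criterion is two consecutive zero-error epochs, which is inferred from the text; the function name and the example error counts are hypothetical.

```python
def block_complete(epoch_errors, n_zero_epochs=2):
    """Return True once the last n_zero_epochs epochs all had zero error.

    n_zero_epochs=2 is an assumption inferred from the text: one epoch
    that reaches zero error plus one additional zero-error epoch.
    """
    if len(epoch_errors) < n_zero_epochs:
        return False
    return all(err == 0 for err in epoch_errors[-n_zero_epochs:])


errors = [6, 3, 1, 0, 0]  # made-up error counts for illustration
print(block_complete(errors[:4]))  # False: only one zero-error epoch so far
print(block_complete(errors))      # True: two zero-error epochs in a row
```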

We next present the second training block, where the features from the other dimension (B) are included, but the target remains the same (the first feature in dimension A). The network has no difficulty with this problem. It should complete the block to criterion after the minimum of 2 epochs. During this time, the AC unit will come to strongly expect the reward signal in the minus phase.

Press StepSettle again to get to the second plus phase. Note that the network window displays all the counter information, so you can more easily keep track of where you are: phase_no should be 2 now.

Recall from chapter 9 that this second plus phase is viewed as part of one overall plus phase, but gets separated out for ease of determining first the change in the AC unit activation over the initial minus-plus phase set, and then the consequences for the PFC. You should see that in this case the PFC units are not updated, because the temporal-differences (TD) signal across the AC unit minus and plus phases was zero, meaning that the hidden-to-PFC weights were not transiently strengthened. Note that there is noise added on top of this basic TD signal, but, because we are using the same set of random numbers (via the ReInit function), we know that this noise was negative and did not activate the PFC.
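
The gating logic described above can be sketched in a few lines. This is a simplified illustration, not the simulator's implementation: the gate threshold of 0, the noise standard deviation, the direct copy of the hidden pattern into the PFC, and all function names are assumptions.

```python
import random


def pfc_gate(ac_minus, ac_plus, noise):
    """Gating value: TD across the AC minus and plus phases, plus noise."""
    return (ac_plus - ac_minus) + noise


def update_pfc(pfc, hidden, gate, threshold=0.0):
    """Return the new PFC pattern.

    Assumed simplification: when the gate exceeds the threshold the PFC
    takes on the hidden pattern; otherwise it keeps its current state.
    """
    if gate > threshold:
        return list(hidden)
    return list(pfc)


def seeded_noise(seed, sd=0.1):
    """Draw one noise value from a generator with a fixed seed."""
    return random.Random(seed).gauss(0.0, sd)


pfc = [0, 0, 0]
hidden = [1, 0, 1]

# Zero TD with slightly negative noise: the PFC is left unchanged.
print(update_pfc(pfc, hidden, pfc_gate(0.0, 0.0, noise=-0.05)))

# Positive TD outweighs the same noise: the PFC is updated.
print(update_pfc(pfc, hidden, pfc_gate(0.0, 1.0, noise=-0.05)))

# Reusing the seed reproduces the same noise value on every run.
print(seeded_noise(7) == seeded_noise(7))
```

Fixing the seed plays the role that reinitializing with the same set of random numbers plays in the text: the noise term is the same on every run, so the outcome of the gate is reproducible.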

You can see the actual TD values used for gating the PFC units in the PFC_td values displayed near the respective PFC layers (both should be negative). Although random noise in the gating signal can activate the PFC units by chance, it turns out that on this run, the units will only get activated when the network produces the correct output.

Press StepTrial (which steps through all three phases of one event) a few times, until you see the PFC units get activated.

When the network produces the correct output, the AC unit then receives a positive reward value in the first plus phase. To see that the network did produce the correct output in the minus phase, we can look at the minus phase activations.

Press act_m in the network window to view the minus phase activations.

You should see that the right output unit is active above 0.5. The reward provided to the AC unit for producing the correct output produces a positive TD signal, which modulates the strength of the hidden-to-PFC weights by a large amount as shown in the PFC_td values, which should both be around 1. This large input strength causes the PFC units to now represent the most strongly active hidden units.
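
The chain from correct output to a large TD value can be sketched as follows. The 0.5 threshold comes from the text; the reward value of 1.0, the function names, and the example activations are assumptions, and the error case is deliberately left unmodeled because its reward encoding is not described in this passage.

```python
def output_is_correct(act_m, target_index, threshold=0.5):
    """True when the target output unit's minus-phase activation
    is above the threshold."""
    return act_m[target_index] > threshold


def td_if_correct(act_m, target_index, expected_reward, reward=1.0):
    """Return reward minus expectation for a correct response.

    Returns None for an incorrect response; that case is not modeled
    in this sketch. reward=1.0 is an assumed value.
    """
    if not output_is_correct(act_m, target_index):
        return None
    return reward - expected_reward


act_m = [0.1, 0.8]  # hypothetical minus-phase output activations
print(td_if_correct(act_m, target_index=1, expected_reward=0.0))  # 1.0
```

With no reward yet expected, the TD value in this sketch equals the full reward, which is consistent with PFC_td values of around 1 on the first rewarded trials.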

StepTrial through the second training block, stopping when the patterns in the input change (you can
