Sequence and Temporally Delayed Learning
At this point, feel free to explore the many parameters available, and see how the network responds. After you change any of the parameters, be sure to press the MakeEnv button to make a new environment based on these new parameters.
Finally, we can present some of the limitations of the CSC representation. One obvious problem is capacity: each stimulus requires a different set of units for all possible time intervals that can be represented. Also, the CSC leaves open the question of how time is initialized to zero at the right point so that every trial is properly synchronized. Finally, the CSC requires that the stimulus stay on (or some trace of it, which you can manipulate using the tr parameter) up to the point of reward, which is unrealistic. This last problem points to an important issue with the TD algorithm: although it can learn to bridge temporal gaps, it requires some suitable representation to support this bridging. We will see in chapters 9 and 11 that this and the other problems can be resolved by allowing the TD system to control the updating of context-like representations.
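To make the capacity problem concrete, the following is a minimal sketch (in Python, not the simulator's own code) of a CSC input: each stimulus needs a separate unit for every representable delay since its onset, so the number of units grows as the product of the number of stimuli and the number of time steps. The layer sizes and the tr-like trace parameter here are illustrative assumptions.

    import numpy as np

    n_stimuli, n_timesteps = 3, 20   # assumed sizes for illustration

    def csc_input(stimulus, onset, t, tr=1.0):
        # One unit per (stimulus, delay-since-onset) pair -- the capacity problem.
        # tr plays the role of the trace parameter: 1.0 keeps the stimulus
        # fully "on", smaller values let its trace decay with the delay.
        x = np.zeros(n_stimuli * n_timesteps)
        if onset <= t < onset + n_timesteps:
            unit = stimulus * n_timesteps + (t - onset)
            x[unit] = tr ** (t - onset)
        return x

    # A different unit carries the (decaying) trace at each step since onset.
    for t in range(4):
        print(np.nonzero(csc_input(stimulus=0, onset=0, t=t, tr=0.8))[0])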
Learning to solve tasks having temporally extended sequential contingencies requires the proper development, maintenance, and updating of context representations that specify a location within the sequence. A simple recurrent network (SRN) enables sequential learning tasks to be solved by copying the hidden layer activations from the previous time step into a context layer. The specialized context maintenance abilities of the prefrontal cortex may play the role of the context layer in an SRN. An SRN can learn a finite state automaton task by developing an internal representation of the underlying node states.
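A minimal sketch of a single SRN step may help make this mechanism concrete; the layer sizes, random weights, and sigmoid units below are illustrative assumptions, and the essential point is only that the context layer is a copy of the previous time step's hidden activations.

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hid, n_out = 4, 6, 3                  # assumed layer sizes
    W_in  = rng.normal(0.0, 0.5, (n_hid, n_in))   # input -> hidden
    W_ctx = rng.normal(0.0, 0.5, (n_hid, n_hid))  # context -> hidden
    W_out = rng.normal(0.0, 0.5, (n_out, n_hid))  # hidden -> output

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def srn_step(x, context):
        hidden = sigmoid(W_in @ x + W_ctx @ context)
        output = sigmoid(W_out @ hidden)
        return output, hidden                     # hidden becomes the next context

    context = np.zeros(n_hid)
    for x in np.eye(n_in):                        # a short input sequence
        y, context = srn_step(x, context)         # copy hidden -> context each step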
The mathematical framework of reinforcement learning can be used for learning with temporally delayed contingency information. The temporal differences (TD) reinforcement learning algorithm provides a good fit to the neural firing properties of neurons in the VTA. These neurons secrete the neuromodulator dopamine to the frontal cortex, and dopamine has been shown to modulate learning. The TD algorithm is based on minimizing differences in expectations of future reward values, and can be implemented using the same phases as in the GeneRec algorithm. Various conditioning phenomena can be modeled using the TD algorithm, including acquisition, extinction, and second-order conditioning.
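The core of the TD algorithm is the temporal-differences error, delta(t) = r(t) + gamma * V(t+1) - V(t), which measures the difference between successive expectations of future reward. The following is a minimal tabular sketch over a single conditioning trial; the trial length, learning rate, and reward timing are illustrative assumptions rather than values from the simulation.

    import numpy as np

    n_steps, gamma, lrate = 16, 1.0, 0.1     # assumed trial length and parameters
    V = np.zeros(n_steps + 1)                # value estimate for each time step
    reward = np.zeros(n_steps)
    reward[n_steps - 1] = 1.0                # reward (US) at the end of the trial

    for trial in range(200):                 # repeated trials produce acquisition
        for t in range(n_steps):
            # TD error: mismatch between successive predictions of future reward
            delta = reward[t] + gamma * V[t + 1] - V[t]
            V[t] += lrate * delta            # move V(t) toward r(t) + gamma*V(t+1)

    print(np.round(V[:n_steps], 2))          # values come to anticipate the reward

Setting the reward back to zero and continuing to run trials drives the learned values back down, which corresponds to extinction; second-order conditioning follows the same logic, with a learned value standing in for the reward itself.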
To stop now, quit by selecting Object/Quit in the
PDP++Root window.
6.8
Summary
Combined Model and Task Learning
There are sound functional reasons to believe that both Hebbian model learning and error-driven task learning are taking place in the cortex. As we will see in later chapters, both types of learning are required to account for the full range of cognitive phenomena considered. Computationally, Hebbian learning acts locally, and is autonomous and reliable, but also myopic and greedy. Error-driven learning is driven by remote error signals, and the units cooperate to solve tasks. However, it can suffer from codependency and laziness. The result of combining both types of learning is representations that encode important statistical features of the activity patterns they are exposed to, and also play a role in solving the particular tasks the network must perform. Specific advantages of the combined learning algorithm can be seen in generalization tasks, and tasks that use a deep network with many hidden layers.
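One simple way to realize this combination is a weighted mixture of the two weight changes, sketched below assuming a CPCA-style Hebbian term and a CHL/GeneRec-style error-driven term; the particular mixing proportion, learning rate, and activation values are illustrative assumptions.

    import numpy as np

    k_hebb, lrate = 0.01, 0.1   # assumed mixing proportion and learning rate

    def combined_dwt(x_minus, y_minus, x_plus, y_plus, w):
        # x/y are sending/receiving activations in the minus (expectation)
        # and plus (outcome) phases; w is the weight matrix (receivers x senders).
        hebb = np.outer(y_plus, x_plus) - y_plus[:, None] * w          # CPCA: y * (x - w)
        err  = np.outer(y_plus, x_plus) - np.outer(y_minus, x_minus)   # CHL: x+y+ - x-y-
        return lrate * (k_hebb * hebb + (1.0 - k_hebb) * err)

    w = np.full((3, 4), 0.5)
    x_m = x_p = np.array([1.0, 0.0, 1.0, 0.0])
    y_m, y_p = np.array([0.2, 0.7, 0.1]), np.array([0.0, 1.0, 0.0])
    w += combined_dwt(x_m, y_m, x_p, y_p, w)

With a small k_hebb, the error-driven term dominates task learning while the Hebbian term continually biases the weights toward the statistical structure of the activity patterns.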
6.9
Further Reading
The Sutton and Barto (1998) Reinforcement Learning book is an excellent reference for reinforcement learning.
Mozer (1993) provides a nice overview of a variety of different approaches toward temporal sequence processing.
The journal Neural Computation and the NIPS conference proceedings (Advances in Neural Information Processing Systems) always have a large number of high-quality articles on computational and biological approaches to learning.
For more detailed coverage of the combination of
error-driven and Hebbian learning, see O'Reilly (1998)
and O'Reilly (in press).