cannot learn to be sensitive to which inputs are more
task relevant than others (unless this happens to be the
same as the input-output correlations, as in the easy
task). This hard task has a complicated pattern of over-
lap among the different input patterns. For the two
cases where the left output should be on, the middle
two input units are very strongly correlated with the
output activity (conditional probability P(x_i|y_j) = 1), while the outside two inputs are only half-correlated (a conditional probability of .5).
The two cases where the left output should be off (and the right one on) overlap considerably with those where it should be on, with the last event containing both of the highly correlated inputs. Thus, if the network just pays attention to correlations, it will tend to respond to this last case when it shouldn't.
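To make this correlational structure concrete, here is a minimal Python sketch that computes the conditional probabilities P(x_i|y_j) from a set of illustrative event patterns. The patterns below are hypothetical stand-ins chosen to match the description above, not necessarily the exact patterns used in the simulator.

import numpy as np

# Hypothetical events: four input units, two output units (left, right).
inputs = np.array([
    [1, 1, 1, 0],   # left output should be on
    [0, 1, 1, 1],   # left output should be on
    [1, 0, 0, 1],   # right output should be on
    [0, 1, 1, 0],   # right output should be on (last event: both middle inputs)
])
targets = np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
])

# Conditional probability P(x_i = 1 | y_j = 1) for each input i and output j
for j, name in enumerate(["left", "right"]):
    on_events = inputs[targets[:, j] == 1]
    print(name, on_events.mean(axis=0))

# For the "left" unit this prints [0.5, 1.0, 1.0, 0.5]: the middle two inputs
# are perfectly correlated with it, the outer two only half-correlated.
# A purely correlational (Hebbian) learner will therefore also respond to the
# last event, which contains both middle inputs but should map to "right".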
Let's see what happens when we run the network on
this task.
Press the Run button in the pat_assoc_ctrl, which does a New Init to produce a new set of random starting weights, and then does a Run. You should be viewing the weights of the left output unit in the network window, with the Display turned on so you can see them being updated as the network learns.
You should see from these weights that the network
has learned that the middle two units are highly corre-
lated with the left output unit, as we expected.
Do TestStep 4 times.
You should see that the network is not getting the
right answers. Different runs will produce slightly dif-
ferent results, but the middle two events should turn the
right output unit on, while the first and last either turn on
the left output, or produce weaker activation across both
output units (i.e., they are relatively equally excited).
The weights for the right output unit show that it has
strongly represented its correlation with the second in-
put unit, which explains the pattern of output responses.
This weight for the right output unit is stronger than those for the left output unit from the two middle inputs because of the different overall activity levels in the different input patterns; this difference in the expected activity level α affects the renormalization correction for the CPCA Hebbian learning rule as described earlier (note that even if this renormalization is set to a constant across the different events, the network still fails to learn).
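To see why the weights end up mirroring these conditional probabilities, here is a minimal sketch of the basic CPCA Hebbian update, dw_ij = lrate * y_j * (x_i - w_ij), without the renormalization and contrast-enhancement corrections discussed in the text. The learning rate value and the event patterns are illustrative assumptions, reusing the hypothetical patterns from the earlier sketch.

import numpy as np

rng = np.random.default_rng(0)
lrate = 0.01
n_inputs, n_outputs = 4, 2
w = rng.uniform(0.4, 0.6, size=(n_outputs, n_inputs))  # random initial weights

# Illustrative (hypothetical) event patterns, as in the earlier sketch.
inputs = np.array([[1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 0, 1], [0, 1, 1, 0]], float)
targets = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], float)

for epoch in range(500):
    for x, y in zip(inputs, targets):
        # Outputs are clamped to their target values during training, so the
        # CPCA update moves each weight toward the current input value
        # whenever its output unit is active.
        w += lrate * y[:, None] * (x[None, :] - w)

print(np.round(w, 2))
# Row 0 (the left output) converges toward [0.5, 1.0, 1.0, 0.5]: the
# conditional probabilities, not the weights needed to get the hard task right.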
Do several more Runs on this HARD task.

Question 5.2 (a) Does the network ever solve the task? (b) Report the final sum_se at the end of training for each run.

Experiment with the parameters that control the contrast enhancement of the CPCA Hebbian learning rule (wt_gain and wt_off), to see if these are playing an important role in the network's behavior.
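For reference, the contrast-enhancement step that wt_gain and wt_off control passes each learned weight through a sigmoidal function before it is used. The sketch below assumes one common form of this function, w_hat = 1 / (1 + (wt_off * (1 - w) / w)^wt_gain); the exact parameterization and default values in the simulator may differ.

# Sketch of a sigmoidal weight contrast-enhancement function of the form
# assumed above; wt_gain sharpens the weight distribution and wt_off shifts
# the crossover point. (Illustrative only; check the simulator for the exact form.)
def contrast_enhance(w, wt_gain=6.0, wt_off=1.25):
    return 1.0 / (1.0 + (wt_off * (1.0 - w) / w) ** wt_gain)

for w in (0.3, 0.5, 0.7, 0.9):
    print(w, round(contrast_enhance(w), 3))
# Weights below the crossover are driven toward 0 and those above toward 1,
# but this only re-expresses the underlying correlations; it cannot turn
# correlation-based weights into task-appropriate ones.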
You should see that changes to these parameters do not lead to any substantial improvements. Hebbian learning does not seem to be able to solve tasks where the correlations do not provide the appropriate weight values. It seems unlikely that there will generally be a coincidence between correlational structure and the task solution. Thus, we must conclude that Hebbian learning is of limited use for task learning. In contrast, we will see in the next section that an algorithm specifically designed for task learning can learn this task without much difficulty.

To continue on to the next simulation, you can leave this project open because we will use it again. Or, if you wish to stop now, quit by selecting Object/Quit in the PDP++Root window.

5.3 Using Error to Learn: The Delta Rule

In this section we develop a task-based learning algorithm from first principles, and continue to refine this algorithm in the remainder of this chapter. In the next chapter, we will compare this new task-based learning mechanism with Hebbian learning, and provide a framework for understanding their relative advantages and disadvantages.

An obvious objective for task learning is to adapt the weights to produce the correct output pattern for each input pattern. To do this, we need a measure of how closely our network is producing the correct outputs, and then some way of improving this measure by adjusting the weights. We can use the summed squared error (SSE) statistic described previously to measure how close to correct the network is. First, we will want to extend this measure to the sum of SSE over all events, indexed by t, resulting in:
SSE = \sum_t \sum_k (t_k - o_k)^2    (5.2)
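As a concrete reading of equation 5.2, the following minimal sketch sums the squared differences between target values t_k and actual output activations o_k over all output units k and all events t; the particular target and output values below are made up for illustration.

import numpy as np

# One row per event: target t_k and actual output o_k for each output unit k.
targets = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], float)
outputs = np.array([[0.9, 0.2], [0.4, 0.6], [0.1, 0.8], [0.7, 0.5]], float)

# SSE = sum over events t and output units k of (t_k - o_k)^2
sse = np.sum((targets - outputs) ** 2)
print(sse)  # 0 only when every output matches its target on every event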