The lrn field shows the learning rate for the weights (lrate, always .01) and for the bias weights (bias_lrate, which is 0 for Hebbian learning because it has no way of training the bias weights, and is equal to lrate for the delta rule), and the proportion of Hebbian learning (hebb, 1 or 0; we will see in the next chapter that intermediate values of this parameter can be used as well).
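As a rough sketch of how these parameters interact (the rule forms and variable names below are assumptions for illustration, not the simulator's actual code), the hebb parameter selects between a CPCA-style Hebbian term and a delta-rule term, and only the delta rule gets a nonzero bias learning rate:

```python
# Illustrative sketch of how lrate, bias_lrate, and hebb gate learning;
# the specific update forms are assumptions, not the simulator's code.

lrate = 0.01  # weight learning rate (always .01 here)

def weight_update(x, act_m, act_p, w, hebb):
    """hebb = 1 selects Hebbian learning; hebb = 0 selects the delta rule."""
    hebbian = act_p * (x - w)        # CPCA-style Hebbian term (assumed form)
    delta = (act_p - act_m) * x      # delta-rule (error-driven) term
    return w + lrate * (hebb * hebbian + (1 - hebb) * delta)

def bias_update(act_m, act_p, b, hebb):
    bias_lrate = 0.0 if hebb else lrate  # Hebbian cannot train the bias
    return b + bias_lrate * (act_p - act_m)
```

Note that when the minus and plus phase activations agree, the delta-rule terms vanish and nothing changes, whereas the Hebbian term keeps pulling the weight toward the input value.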
Before training the network, we will explore how the
minus-plus activation phases work in the simulator.
Figure 5.7: The impossible pattern associator task mapping (Event_0 through Event_3), where there is complete overlap among the patterns that activate the different output units.
Make sure that you are monitoring activations (act) in the network, and set step_level to STEP_SETTLE instead of STEP_TRIAL in the control panel. This will increase the resolution of the stepping, so that each press of the Step button will perform the settling (iterative activation updating) process associated with each phase of processing.
This variability in the weights reflects a critical weakness of error-driven learning: it is lazy. Basically, once the output unit is performing the task correctly, learning effectively stops, with whatever weight values happened to do the trick. In contrast, Hebbian learning keeps adapting the weights to reflect the conditional probabilities, which, in this task, results in roughly the same final weight values regardless of what the initial random weights were. We will return to this issue later, when we discuss the benefits of using a combination of Hebbian and error-driven learning.
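This laziness can be illustrated with a toy setup (hypothetical, not the simulator's task): two input units are always active together with one output, so any pair of weights summing to the target satisfies the delta rule, while CPCA-style Hebbian learning drives every weight to the same conditional probability:

```python
# Toy contrast between "lazy" delta-rule learning and Hebbian learning.
# Hypothetical setup: two inputs, both always active, one linear output.

lrate = 0.1

def train_delta(w1, w2, trials=500):
    for _ in range(trials):
        y = w1 + w2                   # linear output with both inputs = 1
        err = 0.95 - y                # no more change once y hits the target
        w1 += lrate * err
        w2 += lrate * err
    return w1, w2                     # any pair summing to .95 will do

def train_hebb(w, trials=500):
    for _ in range(trials):
        w += lrate * 1.0 * (1.0 - w)  # CPCA-style: w -> P(input | output) = 1
    return w
```

Different initial weights leave the delta rule at different (equally correct) solutions, while the Hebbian weight ends up at 1.0 regardless of where it started.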
Now for the real test. Next, hit the Step button. You will see in the network the actual activation produced in response to the input pattern (also known as the expectation, response, or minus phase activation).
Now, hit Step again. You will see the target (also known as the outcome, instruction, or plus phase) activation. Learning occurs after this second, plus phase of activation. You can recognize targets because their activations are exactly .95 or 0; note that we are clamping activations to .95 and 0 because units cannot easily produce activations above .95 with typical net input values, due to the saturating nonlinearity of the rate code activation function. You can also switch to viewing targ in the network, which will show you the target inputs prior to the activation clamping. In addition, the minus phase activation is always viewable as act_m, and the plus phase as act_p.
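The two-phase bookkeeping can be sketched as follows (a minimal sketch assuming a logistic rate-code unit; the names act_m and act_p match the simulator's, the rest are illustrative):

```python
import math

def sigmoid(net):
    # Saturating rate-code activation: it cannot easily exceed .95 for
    # typical net inputs, which is why targets are clamped to .95 or 0.
    return 1.0 / (1.0 + math.exp(-net))

def trial(x, w, target_on, lrate=0.01):
    # Minus phase: the network's own expectation/response.
    act_m = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    # Plus phase: the outcome/instruction, clamped to .95 or 0.
    act_p = 0.95 if target_on else 0.0
    # Learning occurs only after this second, plus phase.
    dw = [lrate * (act_p - act_m) * xi for xi in x]
    return act_m, act_p, dw
```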
Now, let's monitor the weights.
Set env_type to HARD. Then, press Run. You should see that the network learns this task without much apparent difficulty. Thus, because the delta rule performs learning as a function of how well the network is actually doing, it can adapt the weights specifically to solve the task.
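The point can be sketched with a hypothetical overlapping-pattern task (these are not the simulator's actual HARD patterns): because each update is driven by the remaining error, and the bias weight is trained as well, the delta rule keeps pushing the output error toward zero even when the input patterns overlap.

```python
# Hypothetical overlapping patterns (shared middle input unit); the
# delta rule drives the output error toward zero, bias weight included.
lrate = 0.05
events = [([1, 1, 0], 0.95),   # event A: units 0 and 1 active, target on
          ([0, 1, 1], 0.0)]    # event B: units 1 and 2 active, target off

def delta_train(events, trials=2000):
    w, bias = [0.0, 0.0, 0.0], 0.0
    for _ in range(trials):
        for x, t in events:
            y = sum(wi * xi for wi, xi in zip(w, x)) + bias
            err = t - y                       # learning is error-driven
            for i, xi in enumerate(x):
                w[i] += lrate * err * xi
            bias += lrate * err               # the bias is trained as well
    return w, bias
```

A purely Hebbian rule, by contrast, only tracks conditional probabilities of co-activity and never sees this error signal, so it has no way to pull the overlapping weights apart.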
Question 5.3 (a) Compare and contrast, in a qualitative manner, the nature of the weights learned by the delta rule on this HARD task with those learned by the Hebbian rule (e.g., note where the largest weights tend to be); be sure to do multiple runs to get a general sense of what tends to be learned. (b) Using your answer to the first part, explain why the delta rule weights solve the problem but the Hebbian ones do not (don't forget to include the bias weights, bias.wt, in your analysis of the delta rule case).
Click on r.wt, and then on the left output unit. Then Run the process control panel to complete the training on this EASY task.
The network has no trouble learning this task. However, if you perform multiple Runs, you might notice that the final weight values are quite variable relative to the Hebbian case (you can always switch the LearnRule back to HEBB in the control panel to compare the two learning algorithms).
After this experience, you may think that the delta rule is all-powerful, but we can temper this enthusiasm and motivate the next section.