Question 6.3 (a) What do you notice about the general shape of the standard backpropagation (BP) learning curve (SSE over epochs) in figure 6.9 compared to that of the PURE_ERR Leabra network you just ran? Pay special attention to the first 30 or so epochs of learning. (b) Given that one of the primary differences between these two cases is that the PURE_ERR network has inhibitory competition via the kWTA function, whereas BP does not, speculate about the possible importance of this competition for learning based on these results (also note that the BP network has a much larger learning rate, .39 vs. .01). (c) Now, compare the PURE_ERR case with the original HEBB_AND_ERR case (i.e., where do the SSE learning curves (red lines) start to diverge, and how is this different from the BP case)? (d) What does this suggest about the role of Hebbian learning? (Hint: Error signals get smaller as the network has learned more.)
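To make the kWTA competition mentioned in part (b) concrete, here is a minimal Python sketch of the k-winners-take-all idea: only the k most excited units in a layer are allowed to remain active, and the rest are suppressed. This is a deliberate simplification; the actual Leabra kWTA function computes an inhibitory conductance placed between the k-th and (k+1)-th most excited units rather than thresholding net inputs directly.

    import numpy as np

    def kwta(net_input, k):
        """Simplified k-winners-take-all: set an inhibition threshold
        between the k-th and (k+1)-th most excited units, so only the
        top-k units remain above threshold. (A simplification of
        Leabra's actual kWTA inhibition function.)"""
        sorted_net = np.sort(net_input)[::-1]               # descending excitation
        thresh = (sorted_net[k - 1] + sorted_net[k]) / 2.0  # between k-th and (k+1)-th
        return np.clip(net_input - thresh, 0.0, None)       # subtractive inhibition, rectified

    # Example: a layer of 10 units where only k=3 can win the competition
    net = np.random.rand(10)
    print(kwta(net, k=3))

Running this on random inputs shows how the competition restricts activity to a small subset of units regardless of the overall level of excitation.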
[Figure 6.10 appears here: panel (a) plots SSE and panel (b) plots average settling cycles over epochs 0-100, with curves for Pure Err, Pure Hebb, and Hebb & Err; see caption below.]
Figure 6.10: Graph log display showing both errors (a) and cycles (b) over epochs of training for the three different network learning parameters: pure error-driven learning (Pure Err), pure Hebbian learning (Pure Hebb), and combined Hebbian and error-driven learning (Hebb & Err), which performs the best.
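The three cases in figure 6.10 differ only in how much the weight update is driven by Hebbian versus error-driven learning. As a reminder of what is being mixed, here is a minimal sketch (variable names are ours, and details such as soft weight bounding are omitted) of the combined update used in this chapter: a CPCA Hebbian term and a CHL error-driven term, weighted by a hebb mixing parameter that is 1 for PURE_HEBB, 0 for PURE_ERR, and in between for HEBB_AND_ERR.

    import numpy as np

    def combined_dwt(x_plus, y_plus, x_minus, y_minus, w, k_hebb=0.01):
        """Sketch of the combined weight update: a weighted mix of
        CPCA Hebbian and CHL error-driven learning.
        x: sending activations, y: receiving activations;
        plus/minus: outcome (plus) and expectation (minus) phases."""
        # CPCA Hebbian term: y+ * (x+ - w), moves weights toward the
        # sending activity in proportion to the receiving activity
        dwt_hebb = y_plus[None, :] * (x_plus[:, None] - w)
        # CHL error-driven term: difference of plus- and minus-phase coproducts
        dwt_err = np.outer(x_plus, y_plus) - np.outer(x_minus, y_minus)
        return k_hebb * dwt_hebb + (1.0 - k_hebb) * dwt_err

    # Example with a 4-unit sending and 3-unit receiving layer
    rng = np.random.default_rng(0)
    w = rng.uniform(0.25, 0.75, size=(4, 3))
    x_m, y_m = rng.random(4), rng.random(3)   # minus phase (expectation)
    x_p, y_p = rng.random(4), rng.random(3)   # plus phase (outcome)
    lrate = 0.01
    w += lrate * combined_dwt(x_p, y_p, x_m, y_m, w, k_hebb=0.01)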
To get a sense of how learning has shaped the transformations performed by this network to emphasize relevant similarities, we can do a cluster plot of the hidden unit activity patterns over all the inputs. Let's do a comparison between the initial clusters and those after learning for the default network.
First, press ReInit on the overall control panel to reinitialize the weights. Then, press Cluster. After a bit (it tests all 100 patterns, so be patient), a cluster plot window will appear. We will compare this cluster plot to one for the trained network.
Next, let's examine how pure Hebbian learning fares on its own. You can see this by loading the graph log for a network trained for 100 epochs.
Select LogFile/Load File in the graph log, and choose family_trees.pure_hebb.epc.log.
Although Hebbian model learning is useful for helping error-driven learning, the graph shows that it is simply not capable of learning tasks like this on its own.
Go to the network window and select from the menu at the upper left: Object/Load, and then select family_trees.hebb_err.00.041.net.gz (or the network that you saved). Do Cluster again.
Your results should look something like figure 6.11. There are many ways in which people who appear together can be justifiably related, so you may think there is some sensibility to the initial plot. However, the final plot has a much more sensible structure in terms of the overall nationality difference coming out as the two largest clusters, and individuals within a given generation tending to be grouped together within these overall clusters. The network is able to solve the task by transforming the patterns in this way.
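Incidentally, a cluster plot of this kind is just an agglomerative (hierarchical) clustering of the hidden-layer activation vectors, one per input pattern, so you can reproduce the same style of analysis outside the simulator. Here is a minimal sketch using scipy, with random placeholder data standing in for the recorded hidden activations and hypothetical pattern labels:

    import numpy as np
    from scipy.cluster.hierarchy import linkage, dendrogram
    import matplotlib.pyplot as plt

    # Placeholder data: one hidden-activation vector per input pattern.
    # In the exercise these would be the recorded hidden-layer activities
    # for all 100 family-tree patterns.
    hidden_acts = np.random.rand(12, 25)          # 12 patterns x 25 hidden units
    labels = [f"pattern_{i}" for i in range(12)]  # hypothetical pattern names

    # Agglomerative clustering on distances between activity patterns
    Z = linkage(hidden_acts, method="average", metric="euclidean")
    dendrogram(Z, labels=labels, orientation="right")
    plt.title("Cluster plot of hidden unit activity patterns")
    plt.tight_layout()
    plt.show()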
We next compare all three cases with each other.

Load the family_trees.all.epc.log log file.
This log display has the three runs overlaid on each other (figure 6.10). You can identify the lines based on what epoch they end on (40 = HEBB_AND_ERR, 80 = PURE_ERR, and 100 = PURE_HEBB). It is interesting to note that the orange line (average settling cycles) is fairly well correlated with the training error, and only the combined Hebb and error network achieves a significant speedup in cycles. You might also note that the pure Hebb case starts out with very quick settling in the first few epochs, and then slows down over training.
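For readers unfamiliar with the cycles measure: "settling" means iteratively updating activations until they stop changing, and the number of update cycles needed is itself an index of processing efficiency. Here is a rough illustration of the idea as a generic recurrent settling loop (not Leabra's actual point-neuron dynamics):

    import numpy as np

    def settle(w, inp, dt=0.2, tol=1e-4, max_cycles=200):
        """Update activations until the per-cycle change falls below tol;
        return the settled activations and the cycle count.
        A generic recurrent-settling loop, not Leabra's equations."""
        act = np.zeros(len(inp))
        for cycle in range(1, max_cycles + 1):
            net = inp + w @ act                                   # external + recurrent input
            new_act = act + dt * (1 / (1 + np.exp(-net)) - act)   # move toward sigmoid of net
            if np.max(np.abs(new_act - act)) < tol:               # settled: activations stable
                return new_act, cycle
            act = new_act
        return act, max_cycles

    # Example: 5 units with weak random recurrent weights
    rng = np.random.default_rng(1)
    w = rng.normal(scale=0.1, size=(5, 5))
    act, cycles = settle(w, inp=rng.random(5))
    print(f"settled in {cycles} cycles")

Because a well-trained network's weights support a consistent interpretation of each input, it reaches a stable state in fewer cycles, which is why cycles track training error in figure 6.10.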