You should see that the weights look relatively random (figure 6.5) and clearly do not reflect the linear structure of the underlying environment. To see how much these weights change over learning from the truly random initial weights, we can run again and watch the weight log, which is updated every 5 epochs as before.
Press Run .
The generalization error measure, the hidden unit weights, and the unique pattern statistic all provide converging evidence for a coherent story about why generalization is poor in a purely error-driven network. As we said, generalization here depends on being able to recombine representations that systematically encode the individual line elements independent of their specific training contexts. In contrast, error-driven weights are generally relatively underconstrained by learning tasks, and thus reflect a large contribution from the initial random values, rather than the kind of systematicity needed for good generalization. This lack of constraint prevents the units from systematically carving up the input/output mapping into separable subsets that can be independently combined for the novel testing items; instead, each unit participates haphazardly in many different aspects of the mapping. The attractor dynamics in the network then impair generalization performance. Thus, the poor generalization arises from the effects of the partially random weights on the attractor dynamics of the network, preventing it from combining novel line patterns in a systematic fashion.
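The recombination idea above can be sketched concretely. If each hidden unit codes one line element regardless of context (a hypothetical, idealized code for illustration, not the simulator's actual representation), then any novel pair of trained lines maps onto the union of two already-familiar codes:

```python
# Illustrates why systematic line codes support recombination: each hidden
# unit stands for one line element independent of context, so a novel pair
# of lines is just the union of two familiar codes. (Hypothetical codes,
# purely illustrative.)
line_codes = {
    "h0": {0}, "h1": {1},  # horizontal lines -> hidden units 0, 1
    "v0": {2}, "v1": {3},  # vertical lines   -> hidden units 2, 3
}

def hidden_pattern(lines):
    """Set of hidden units active for a combination of line elements."""
    units = set()
    for line in lines:
        units |= line_codes[line]
    return units

# Suppose training included ("h0", "v0") and ("h1", "v1"); the novel
# pair ("h0", "v1") is simply the union of codes learned in other contexts.
print(hidden_pattern(("h0", "v1")))  # {0, 3}
```

A haphazard code, by contrast, would assign each unit to arbitrary conjunctions of lines and contexts, so no such union of parts exists for a novel pairing.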
To determine how representative this particular result
is, we can run a batch of 5 training runs. To record the
results of these training runs, we need to open up a few
logs.
Figure 6.5: Final weights after training with pure error-driven learning. Note how random they look compared to the weights learned when Hebbian learning is used (cf. figure 4.13).
each unit), so that each unit has to have its activation on the right side of .5 for the event not to be counted in this measure. This is plotted as the red line in the graph log, where the simulator labels it cnt_sum_se .
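A minimal sketch of how such a thresholded error count could be computed (the function name and exact criterion here are illustrative assumptions, not the simulator's actual code): an event counts as an error if any of its output units falls on the wrong side of .5.

```python
import numpy as np

def count_errors(outputs, targets, thresh=0.5):
    """Count events in which any output unit's activation falls on the
    wrong side of the threshold (a sketch of a cnt_sum_se-style measure)."""
    errors = 0
    for out, tgt in zip(outputs, targets):
        # a unit is wrong if its activation is on the wrong side of .5
        wrong = ((tgt >= 0.5) & (out < thresh)) | ((tgt < 0.5) & (out >= thresh))
        if wrong.any():
            errors += 1
    return errors

outputs = np.array([[0.9, 0.1], [0.4, 0.8], [0.7, 0.2]])
targets = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 0.0]])
print(count_errors(outputs, targets))  # only the second event has a unit
                                       # on the wrong side of .5 -> 1
```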
One of the test statistics, plotted in green, measures the generalization performance of the network ( gen_cnt ). The green line plots this generalization performance in terms of the number of testing events that the network gets wrong (out of the 10 testing items), so the smaller this value, the better the generalization performance. This network appears to be quite bad at generalizing, with 9 of the 10 novel testing patterns having errors.
The other test statistic, plotted in yellow ( unq_pats ), is the same unique pattern statistic as used before (section 4.8.1), which measures the extent to which the hidden units represent the lines distinctly (from 0, meaning no lines distinctly represented, to 10, meaning all lines distinctly represented). This unique pattern statistic shows that the hidden units do not represent all of the lines distinctly, though this statistic does not seem nearly as bad as either the generalization error or the weights that we consider next.
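The gist of such a statistic can be sketched as follows, assuming we binarize each line's hidden representation at 0.5 and count the lines whose pattern differs from all others (the exact computation in section 4.8.1 may differ in detail):

```python
import numpy as np

def unique_pattern_stat(line_reps, thresh=0.5):
    """Sketch of a unique-pattern statistic: count how many lines evoke
    a binarized hidden pattern shared with no other line. Binarizing at
    0.5 is an assumption for illustration."""
    patterns = [tuple((rep >= thresh).astype(int)) for rep in line_reps]
    return sum(patterns.count(p) == 1 for p in patterns)

# Three lines over three hidden units; the first two lines collapse onto
# the same binarized pattern, so only the third is distinctly represented.
reps = np.array([[0.9, 0.1, 0.8],
                 [0.8, 0.2, 0.7],
                 [0.1, 0.9, 0.2]])
print(unique_pattern_stat(reps))  # 1
```

With 10 lines, a fully systematic hidden layer would score 10 on this measure, matching the scale described above.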
Do View , TRAIN_TEXT_LOG , and View ,
BATCH_TEXT_LOG , and then press the Batch button on
the control panel.
The batch text log will present summary statistics
from the 5 training runs, and the train text log shows
the final results after each training run.
Question 6.1 Report the summary statistics from the batch text log ( Batch_1_Textlog ) for your batch run. Does this indicate that your earlier observations were generally applicable?
Do View , WT_MAT_LOG in the control panel, to dis-
play the weight values for each of the hidden units.