Combined Model and Task Learning, and Other Mechanisms - Computational Explorations in Cognitive Neuroscience


training. The 15 epochs amount to only 175 different sequences, and the 54 epochs amount to 650 sequences (each set of 25 sequences lasts for 2 epochs).

In either case, the Leabra network is much faster than the backpropagation network used by Cleeremans et al. (1989), which took 60,000 sequences (i.e., 4,800 epochs under our scheme). However, we were able to train backpropagation networks with larger hidden layers (30 units instead of 3) to learn in between 136 and 406 epochs. Thus, there is some evidence of an advantage for the additional constraints of model learning and inhibitory competition in this task, given that the Leabra networks generally learned much faster (and backpropagation required a much larger learning rate).
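The conversion between sequences and epochs used above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes only what the text states (each set of 25 sequences lasts for 2 epochs); the function name is hypothetical and is not part of the simulator.

```python
# Assumption taken from the text: each set of 25 sequences lasts for 2 epochs.
SEQUENCES_PER_SET = 25
EPOCHS_PER_SET = 2


def sequences_to_epochs(n_sequences: int) -> float:
    """Convert a count of distinct training sequences into epochs."""
    return n_sequences / SEQUENCES_PER_SET * EPOCHS_PER_SET


# Cleeremans et al. (1989) reported 60,000 sequences.
backprop_epochs = sequences_to_epochs(60_000)
print(backprop_epochs)  # 4800.0, matching the figure quoted in the text
```

The result agrees with the 4,800 epochs quoted for the original backpropagation network, which is the figure the Leabra epoch counts are being compared against.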

Now we can test the trained network to see how it has solved the problem, and also to see how well it distinguishes grammatical from ungrammatical letter strings.

[Figure 6.15 plot: cluster plot titled "MonitorEnv Pattern: 0". Each leaf is labeled in the form letter->next letter_node->next node, e.g., T->V_2->4 or B->T_0->1.]

Do View, TEST_GRID_LOG to open a log to display the test results. Then, do Test.

This will test the network with one sequence of letters, with the results shown in the grid log on the right. Note that the network display is being updated every cycle, so you can see the stochastic choosing of one of the two possible outputs. The network should be producing the correct outputs, as indicated both by the fsa_err column and by the fact that the Output pattern matches the Target pattern, though it might make an occasional mistake due to the noise.
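To make the idea of a "correct output" concrete, here is a minimal sketch of a legality check: a prediction counts as correct if the chosen letter is one of the letters the grammar allows at the current node. The table is read off the leaf labels of Figure 6.15 and so lists only transitions that occur in that one test sequence; it may be incomplete. The function name is hypothetical, and how the simulator actually computes fsa_err is not specified here.

```python
# Legal next letters per FSA node, read off the Figure 6.15 leaf labels
# (e.g., "T->V_2->4" means V is a legal next letter at node 2).
# Assumption: only transitions seen in that one sequence are listed.
LEGAL_NEXT = {
    0: {"T"},
    1: {"S", "X"},
    2: {"T", "V"},
    3: {"X"},
    4: {"P", "V"},
    5: {"E"},
}


def prediction_error(node: int, predicted_letter: str) -> int:
    """Return 0 if the predicted letter is legal at this node, else 1."""
    return 0 if predicted_letter in LEGAL_NEXT[node] else 1


print(prediction_error(2, "V"))  # 0: V is a legal continuation at node 2
print(prediction_error(2, "S"))  # 1: S is not listed for node 2
```

At nodes with two listed continuations, either letter counts as correct, which is consistent with the network choosing stochastically between the two possible outputs.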

To better understand the hidden unit representations, we need a sequence of reasonable length (i.e., more than ten or so events). In these longer sequences, the FSA has revisited various nodes due to selecting the looping path, and this revisiting will tell us about the representation of the individual nodes. Thus, if the total number of events in the sequence was below ten (events are counted in the tick column of the grid log), we need to keep Testing to find a suitable sequence.


Figure 6.15: Cluster plot of the FSA hidden unit representations for a long sequence. The labels for each node describe the current and next letter and the current and next node (which the network is trying to predict). For example, T → V indicates that T was the letter input when the hidden state was measured for the cluster plot, and the subsequent letter (which does not affect the cluster plot) was V. Similarly, the associated 2 → 4 indicates that the node was 2 when the hidden state was measured for the cluster plot, and the subsequent node (which does not affect the cluster plot) was 4. The current letter and node are relevant to evaluating the cluster plot, whereas the next letter and node indicate what the network was trained to predict. The letters are ambiguous (appearing in multiple places in the grammar), but the nodes are not.
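The claim that letters are ambiguous can be checked directly against the leaf labels of the plot. The sketch below parses labels of the form letter->next letter_node->next node and collects, for each current letter, the set of current nodes at which it occurs. The label list is transcribed from the figure; the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Leaf labels transcribed from the Figure 6.15 plot (duplicates omitted).
LABELS = [
    "P->X_3->2", "V->E_5->0", "V->P_4->3", "V->V_4->5", "T->V_2->4",
    "T->T_2->2", "T->S_1->1", "S->X_1->3", "S->S_1->1", "X->X_3->2",
    "X->T_2->2", "X->V_2->4", "B->T_0->1",
]

LABEL_RE = re.compile(r"^([A-Z])->([A-Z])_(\d)->(\d)$")


def parse_label(label):
    """Split a label into (letter, next_letter, node, next_node)."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized label: {label!r}")
    letter, next_letter, node, next_node = match.groups()
    return letter, next_letter, int(node), int(next_node)


def nodes_per_letter(labels):
    """Map each current letter to the set of current nodes it appears at."""
    result = defaultdict(set)
    for label in labels:
        letter, _, node, _ = parse_label(label)
        result[letter].add(node)
    return dict(result)


for letter, nodes in sorted(nodes_per_letter(LABELS).items()):
    print(letter, sorted(nodes))
```

In these labels, T, X, and V each occur at two different nodes, so the current letter alone does not identify the current node.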

To do so, turn the network Display toggle off (to speed things up), and press Test again until you find a sequence with ten or more events. After running the sequence with ten or more events, press the Cluster button on the fsa_ctrl control panel.

This will bring up a cluster plot of the hidden unit states for each event (e.g., figure 6.15). Figure 6.15 provides a decoding of the cluster plot elements.

Question 6.4 Interpret the cluster plot you obtained (especially the clusters with events at zero distance) in terms of the correspondence between hidden states and the current node versus the current letter. Remember that current node and current letter information is reflected in the letter and number before the arrow.
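As a rough illustration of what "events at zero distance" means in a cluster plot, the sketch below groups events whose hidden state vectors are identical up to a small tolerance. It is not the simulator's clustering routine, and it assumes Euclidean distance; the vectors are made-up toy values with neutral labels, not results from the network.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def zero_distance_groups(events, tol=1e-6):
    """Group event labels whose hidden vectors lie within tol of each other.

    events is a list of (label, vector) pairs; each group is compared
    against the first vector that was placed in it.
    """
    groups = []  # list of (representative_vector, [labels])
    for label, vec in events:
        for rep, labels in groups:
            if euclidean(rep, vec) <= tol:
                labels.append(label)
                break
        else:
            # No existing group matched, so start a new one.
            groups.append((vec, [label]))
    return [labels for _, labels in groups]


# Toy hidden states (hypothetical values, three hidden units).
events = [
    ("e0", (1.0, 0.0, 0.0)),
    ("e1", (1.0, 0.0, 0.0)),
    ("e2", (0.0, 1.0, 0.0)),
    ("e3", (0.0, 1.0, 0.0)),
    ("e4", (0.0, 0.0, 1.0)),
]
print(zero_distance_groups(events))
```

Events that land in the same group here would be joined at distance zero in a cluster plot; the question is then whether the events in such a group share a current node or a current letter.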


Now, switch the test_env from TRAIN_ENV to RANDOM_ENV (Apply). Then Test again.

This produces a random sequence of letters. Obviously, the network is not capable of predicting which
