words one has heard so far. Second, it makes it possible
for the network to actually achieve correct performance
most of the time, which is not possible with the original
method. This enables us to monitor its learning perfor-
mance over training more easily.
Third, with Leabra, there is no way to present ques-
tions about future information during training without
effectively exposing the network to that information.
Specifically, in Leabra errors are equivalent to activa-
tion states, and the entire activation state of the network
must be updated for proper error signals to be propa-
gated for each question. Thus, if all of the questions
were asked after each input, information about the en-
tire sentence would be propagated into the network (and
thus into the gestalt representation) via the plus phases
of the questions. To preserve the idea that the gestalt
is updated primarily from the inputs, we ask questions
only about current or previous inputs. The original SG
model instead used a somewhat complicated error prop-
agation mechanism, where all the errors for the ques-
tions were accumulated separately in the output end of
the network, and then passed back to the encoding por-
tion of the network after the entire sentence had been
processed. Thus, there was a dissociation between acti-
vation states based on the inputs received so far, and the
subsequent error signals based on all the questions, al-
lowing all of the questions to be asked after each input.
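To make the schedule used here concrete, the following minimal Python sketch (not the simulator's own code; the sentence encoding and role labels are invented for illustration) generates the questions asked after each word, covering only the current and previous roles:

def question_schedule(sentence):
    """Yield (word, questions) pairs, where questions covers only the
    roles presented so far, in order of presentation."""
    roles_seen = []
    for word, role in sentence:
        roles_seen.append(role)
        yield word, list(roles_seen)

sentence = [("boy", "agent"), ("ate", "action"), ("soup", "patient")]
for word, questions in question_schedule(sentence):
    print(word, "->", questions)
# boy -> ['agent']
# ate -> ['agent', 'action']
# soup -> ['agent', 'action', 'patient']

Under the original SG scheme, by contrast, the inner list would cover every role in the sentence on every step, and the resulting errors would be accumulated and passed back only after the final word.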
Training the network by asking it explicitly about
roles is only one of many different possible ways that
this information could be learned. For example, visual
processing of an event simultaneous with a verbal de-
scription of it would provide a means of training the
sentence-based encoding to reflect the actual state of
affairs in the environment. However, the explicit role-
filler training used here is simple and provides a clear
picture of how well the gestalt representation has en-
coded the appropriate information.
The network parameters were fairly standard for a
larger sized network, with 25 percent activity in the
encoding and decoding hidden layers, and 15 percent
activity in the gestalt hidden layer. The proportion of
Hebbian learning was .001, and the learning rate was
reduced from .01 to .001 after 200 epochs of train-
ing. The fm hid and fm prv parameters for updating
the context layer were set to the standard values of .7
and .3, which allows for some retention of prior states
but mostly copies the current hidden state.
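As a rough sketch of what these parameters control, the context layer can be thought of as blending the current hidden-layer activations with its own previous state. The following Python snippet assumes that simple blending rule; it is not the simulator's actual implementation:

import numpy as np

FM_HID = 0.7  # weight on the current hidden-layer activations
FM_PRV = 0.3  # weight on the context layer's previous state

def update_context(prev_context, hidden):
    # Mostly copy the current hidden state, but retain some of the
    # prior context (hysteresis), per the assumed blending rule.
    return FM_HID * hidden + FM_PRV * prev_context

context = np.zeros(3)
for hidden in (np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])):
    context = update_context(context, hidden)
    print(context)
# first step: [0.7, 0.0, 0.0]; second step: [0.21, 0.7, 0.0]

With .7 on the hidden state and .3 on the prior context, the context layer tracks the hidden layer closely while keeping a fading trace of earlier states.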
10.7.2 Exploring the Model
Open the project sg.proj.gz in chapter_10 to
begin.
As usual, the network is in skeleton form and must
be built.
Do BuildNet on the sg_ctrl overall control panel
to build it.
Then, you can poke around the network and explore
the connectivity using the r.wt button, and then return
to viewing act.
Note that the input/output units are all labeled accord-
ing to the first two letters of the word, role, or concept
that they represent.
Training
First, let's see exactly how the network is trained by
stepping through some training trials.
Open up a training log by doing View, TRAIN_LOG,
and then open up a process control panel for training
by doing View, TRAIN_PROCESS_CTRL. Do ReInit.
There will be a delay while an entire epoch's worth
of sentences (100 sentences) are randomly generated.
Then press Step (a similar delay will ensue, due to the
need to recreate these sentences at the start of every
epoch — because these happen at different levels of
processing, the redundancy is difficult to avoid).
You should see the first word of the first sentence pre-
sented to the network. Recall that as each word is pre-
sented, questions are asked about all current and previ-
ous information presented to the network. Because this
is the first word, the network has just performed a mi-
nus and plus phase update with this word as input, and
it tried to answer what the agent of the sentence is.
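To connect this to the learning mechanics: in the minus phase the network produces its own answer to the question, in the plus phase the correct answer is clamped on the output, and the weights change in proportion to the difference in coproducts between the two phases. The snippet below sketches only that generic contrastive (CHL/GeneRec-style) error-driven update; it omits the Hebbian component and the kWTA inhibition of the full Leabra algorithm, and the activation values are invented for illustration:

import numpy as np

def chl_update(w, x_minus, y_minus, x_plus, y_plus, lrate=0.01):
    # Plus-phase coproducts (correct answer clamped) minus minus-phase
    # coproducts (network's own answer).
    return w + lrate * (np.outer(y_plus, x_plus) - np.outer(y_minus, x_minus))

w = np.zeros((1, 1))
x_minus, y_minus = np.array([0.9]), np.array([0.2])  # network's guess at the agent
x_plus,  y_plus  = np.array([0.9]), np.array([1.0])  # correct agent unit clamped on
w = chl_update(w, x_minus, y_minus, x_plus, y_plus)
print(w)  # the weight increases, pushing future answers toward the target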
To understand how the words are presented, let's first
look at the training log (Trial_0_TextLog, see fig-
ure 10.28 for an example from the trained network).
The trial and EventGp columns change with each
different word of the sentence, with EventGp showing
the word that is currently being processed. Within the
presentation of each word, there are one or more events