You should replicate the same basic effect we saw above: attention and binding are more closely related to each other than they are to dyslexia. This can be seen in the cluster plot (figure 10.25) by noting that attention and binding are clustered together. The cosine matrix appears in the terminal window where you started the program (it should look like table 10.12). Here, you can see that the cosine between “attention” and “binding” is .415 (relatively high), while that between “attention” and “dyslexia” is only .090, and that between “binding” and “dyslexia” is only .118.
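For concreteness, here is a minimal sketch of the kind of computation that produces such a cosine matrix; it is not the simulator's actual code, and the weight vectors below are random stand-ins, so the printed values will not match the ones above.

```python
import numpy as np

def cosine(a, b):
    """Cosine between two vectors: dot product divided by the
    product of their lengths (1.0 = identical direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in weight vectors: one row of hidden-layer weights per word.
rng = np.random.default_rng(0)
words = ["attention", "binding", "dyslexia"]
W = rng.random((len(words), 20))

for i in range(len(words)):
    for j in range(i + 1, len(words)):
        print(f"cos({words[i]}, {words[j]}) = {cosine(W[i], W[j]):.3f}")
```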
Figure 10.25: Cluster plot of the similarity structure for attention, binding, and dyslexia as produced by the WordMatrix function.
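A cluster plot of this sort can be generated by hierarchical clustering over cosine distances (one minus the cosine). A sketch using SciPy, again with stand-in vectors rather than the network's actual weights:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist

words = ["attention", "binding", "dyslexia"]
W = np.random.default_rng(1).random((len(words), 20))

# pdist with metric="cosine" yields 1 - cosine for each pair,
# in the condensed form that linkage() expects.
tree = linkage(pdist(W, metric="cosine"), method="average")

dendrogram(tree, labels=words)
plt.title("Similarity structure (stand-in data)")
plt.show()
```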
Do a WordMatrix for several other words that the network should know about from “reading” this textbook.

Question 10.11 (a) Report the cluster plot and cosine matrix results. (b) Comment on how well this matches your intuitive semantics from having read this textbook yourself.
Distributed Representations via Activity Patterns

To this point we have only used the patterns of weights to the hidden units to determine how similar the semantic representations of different words are. We can also use the actual pattern of activation produced over the hidden layer as a measure of semantic similarity. This is important because it allows us to present multiple word inputs at the same time, and have the network choose a hidden layer representation that best fits this combination of words. Thus, novel semantic representations can be produced as combinations of semantic representations for individual words. This ability is critical for some of the more interesting and powerful applications of these semantic representations (e.g., multiple-choice question answering, essay grading, etc.).
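The following toy version conveys the idea; the real network settles interactively over its connections, which this feedforward sketch ignores (a logistic squash stands in for settling), and all names and vectors are invented for illustration. The ActProbe function described next does the real version of this within the trained network.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab = ["attention", "binding", "invariant", "object", "recognition"]
W = rng.random((len(vocab), 20))  # stand-in input-to-hidden weights

def hidden_pattern(word_set):
    """Activate every word in the set at once and squash the summed
    input, yielding one blended hidden-layer pattern."""
    net = sum(W[vocab.index(w)] for w in word_set)
    return 1.0 / (1.0 + np.exp(-(net - net.mean())))

def probe(set1, set2):
    """Cosine between the hidden patterns for two word sets."""
    a, b = hidden_pattern(set1), hidden_pattern(set2)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(probe(["attention"], ["invariant", "object", "recognition"]))
print(probe(["attention", "binding"],
            ["invariant", "object", "recognition"]))
```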
The ActProbe function can be used to do this activation-based probing of the semantic representations.
Select act in the network window. Press ActProbe on the sem_ctrl control panel, and you will be prompted for two sets of words. Let's start with the same example we have used before, entering “attention” for the first word set, and “binding” for the second.

You should see the network activations updating as the word inputs are presented. The result pops up in a window, showing the cosine between the hidden activation patterns for the two sets of words. Notice that this cosine is lower than that produced by the weight-based analysis of the WordMatrix function. This can happen due to the activation dynamics, which can either magnify or minimize the differences present in the weights.
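To get a feel for how a nonlinearity alone can shift similarity, compare the cosine of two made-up patterns before and after a sharp squashing function (loosely standing in for the hidden layer's competitive dynamics); depending on the patterns, the activation-based cosine can come out higher or lower than the weight-based one.

```python
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

x = np.array([0.2, 0.9, 0.1, 0.8])  # made-up weight-based patterns
y = np.array([0.3, 0.7, 0.2, 0.6])

# Steep logistic: values near 1 survive, values near 0 are cut off.
f = lambda v: 1.0 / (1.0 + np.exp(-12.0 * (v - 0.5)))

print(f"weight-based cosine:     {cos(x, y):.3f}")
print(f"activation-based cosine: {cos(f(x), f(y)):.3f}")
```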
Next, let's use ActProbe to see how we can sway an otherwise somewhat ambiguous term to be interpreted in a particular way. For example, the term “attention” can be used in two somewhat different contexts. One context concerns the implementational aspects of attention, most closely associated with “competition.” Another context concerns the use of attention to solve the binding problem, which is associated with “invariant object recognition.” Let's begin this exploration by first establishing the baseline association between “attention” and “invariant object recognition.”
Do an ActProbe with “attention” as the first word set, and “invariant object recognition” as the second.
You should get a cosine of around .302. Now, let's see if adding “binding” in addition to “attention” increases the hidden layer similarity.
Do an ActProbe with “attention binding” as the first word set, and “invariant object recognition” again as the second.
The similarity does indeed increase, producing a cosine of around .326. To make sure that there is an in-