appear to hold in the patients, however. The model im-
plementing the sensory-functional idea demonstrated
that the kinds of mutual support dynamics that emerge
from bidirectionally connected neural networks (chap-
ter 3) can account for this effect: after the functional se-
mantics associated with living things lost their mutual
support from the previously stronger sensory seman-
tics representations, they became much more difficult
to activate and therefore showed an impairment, even
though the functional representations themselves were
intact. In summary, the Farah and McClelland (1991)
model showed how distributed semantic representations
can exhibit counterintuitive dynamics under damage, in
a manner consistent with the observed neuropsycholog-
ical data.
The Farah and McClelland (1991) model is consis-
tent with all the basic principles of our framework, but
it does not address the question of how semantic rep-
resentations develop from experience in the first place,
and it cannot be easily probed using our own common-
sense intuitions about semantics because it used random
semantic patterns (much like we did above in the past-
tense model). Therefore, we instead explore a model
that implements just one piece of the larger distributed
semantic network but allows us to see how semantic
representations can develop in the first place, and to ex-
plore the properties of these representations using our
own familiar intuitions about semantic meaning.
The semantic representations in our model emerge
from simply accumulating information about which
words tend to occur together in speech or reading. This
model is based on the ideas of Landauer and Dumais
(1997), who have developed a method they call latent
semantic analysis or LSA that is based on computing
the principal components (using a variant of PCA; see
chapter 4) of word co-occurrence statistics. They have
shown that just this co-occurrence information can yield
semantic representations that do a surprisingly good job
at mimicking human semantic judgments.
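To make the co-occurrence idea concrete, here is a minimal sketch of an LSA-style pipeline (not Landauer and Dumais's actual implementation): build a word-by-document count matrix from a toy corpus, reduce it with a truncated SVD, and compare the resulting word vectors by cosine similarity. The corpus, the number of retained dimensions, and the raw-count weighting are illustrative assumptions; full LSA typically applies a log-entropy weighting and retains a few hundred dimensions.

```python
import numpy as np

# Toy corpus: each "document" is one string (an illustrative stand-in for real text).
docs = [
    "bread butter breakfast toast",
    "butter toast jam breakfast",
    "train locomotive engine rail",
    "rail engine train station",
]

# Build the word-by-document count matrix.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        counts[index[w], j] += 1

# Truncated SVD keeps only the k strongest ("latent") dimensions.
k = 2
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
word_vecs = U[:, :k] * S[:k]          # each row is a word's reduced semantic vector

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Words that occur in similar contexts end up with similar vectors.
print(cosine(word_vecs[index["bread"]], word_vecs[index["butter"]]))      # high
print(cosine(word_vecs[index["bread"]], word_vecs[index["locomotive"]]))  # low
```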
For example, an LSA system trained on the text of a
psychology textbook was able to get a passing grade on
a multiple choice exam based on the text. Although the
performance (roughly 65-70% correct) was well short
of that of a good student, it is nonetheless surprising that
a simple automated procedure can perform as well as it
does. LSA has also been used to perform automated
essay grading, by comparing the semantic representa-
tion of a student's essay with those of various reference
essays that have received different human grades. The
correlation between grades assigned by LSA and those
of human graders was the same as that between differ-
ent human graders.
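The grading comparison can be sketched as a similarity computation in the reduced semantic space. The similarity-weighted average below is one plausible variant for illustration, not the exact procedure used in the published work; the `index` and `projection` arguments are assumed to come from an SVD like the one sketched above.

```python
import numpy as np

def lsa_vector(text, index, projection):
    """Project a bag-of-words count vector into the reduced LSA space.
    `index` maps words to rows; `projection` is an assumed U[:, :k] matrix."""
    v = np.zeros(len(index))
    for w in text.split():
        if w in index:
            v[index[w]] += 1
    return v @ projection

def grade_essay(essay_vec, reference_vecs, reference_grades):
    """Assign a grade as the similarity-weighted average of reference grades."""
    sims = np.array([
        r @ essay_vec / (np.linalg.norm(r) * np.linalg.norm(essay_vec) + 1e-12)
        for r in reference_vecs
    ])
    weights = np.clip(sims, 0.0, None)   # ignore negatively related references
    return float(weights @ np.asarray(reference_grades) / (weights.sum() + 1e-12))
```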
The word co-occurrence approach captures word as-
sociation semantics, not just definitional semantics. For
example, the definition of the word “bread” has nothing
to do with the word “butter,” but these two words are
highly linked associationally. It appears that these asso-
ciational links are important for capturing the structure
of human semantic memory. For example, in semantic
priming studies, the word “butter” is read faster when
preceded by the word “bread” than when preceded by
an unrelated word (e.g., “locomotive”).
As you might expect from chapter 4, Hebbian model
learning provides a natural mechanism for learning
word co-occurrence statistics. Indeed, CPCA Hebbian
learning is closely related to the sequential prin-
cipal components analysis (SPCA) technique that the
LSA method is based on. Interestingly, the network and
training we use here are essentially identical to those
used in the Hebbian model of receptive field develop-
ment in the early visual system, as described in chap-
ter 8. In both of these models, the network extracts the
reliable correlations from naturalistic input stimuli us-
ing a simple Hebbian learning mechanism.
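In its minimal form, the CPCA Hebbian update from chapter 4 moves each active receiving unit's weights toward the current input pattern, so the weights come to reflect the conditional probability of each input being active when the unit is active. The following sketch states just that rule; the vectorized form and shapes are implementation assumptions for illustration.

```python
import numpy as np

def cpca_update(w, x, y, lrate=0.01):
    """CPCA Hebbian rule: dw_ij = lrate * y_j * (x_i - w_ij).
    w: (n_inputs, n_hidden) weights, x: (n_inputs,) input activations,
    y: (n_hidden,) hidden-unit activations."""
    return w + lrate * y[None, :] * (x[:, None] - w)
```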
This type of learning requires a sufficiently large
sample of text for the extraction of useful statistics
about which words tend to co-occur. A particularly con-
venient source of such text is this textbook itself! So,
we trained a simple CPCA Hebbian network by having
it “read” an earlier draft of this textbook paragraph-by-
paragraph, causing it to represent the systematic pat-
terns of co-occurrence among the words. In the follow-
ing exploration, we then probe the resulting network to
see if it has captured important aspects of the informa-
tion presented in this text.
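A rough sketch of that training regime, under simplifying assumptions: each paragraph becomes a binary vector over the vocabulary, a crude k-winners-take-all stands in for the inhibitory competition used in the actual simulation, and the CPCA rule above accumulates the co-occurrence structure into the weights. After training, two words can be compared by the overlap of their learned weight rows.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_cooccurrence(paragraphs, vocab, n_hidden=50, k_winners=5,
                       epochs=5, lrate=0.01):
    """Hebbian co-occurrence learning over paragraphs (illustrative sketch)."""
    index = {w: i for i, w in enumerate(vocab)}
    w = rng.uniform(0.4, 0.6, size=(len(vocab), n_hidden))  # mid-range initial weights
    for _ in range(epochs):
        for para in paragraphs:
            x = np.zeros(len(vocab))
            for word in para.split():
                if word in index:
                    x[index[word]] = 1.0              # binary "word present" input
            h = x @ w                                  # net input to hidden units
            y = np.zeros(n_hidden)
            y[np.argsort(h)[-k_winners:]] = 1.0        # crude k-winners-take-all
            w += lrate * y[None, :] * (x[:, None] - w)  # CPCA update (see above)
    return w, index

def word_similarity(w, index, a, b):
    """Compare two words by the overlap of their learned weight rows."""
    va, vb = w[index[a]], w[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb) + 1e-12))
```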
One of the most important themes that emerges from
this exploration, which is highly relevant for the larger
distributed semantics framework, is the power of dis-
tributed representations to capture the complexity of se-
mantic information. We will see that the distributed and