appear to hold in the patients, however. The model im-
plementing the sensory-functional idea demonstrated
that the kinds of mutual support dynamics that emerge
from bidirectionally connected neural networks (chap-
ter 3) can account for this effect: after the functional se-
mantics associated with living things lost their mutual
support from the previously stronger sensory seman-
tics representations, they became much more difficult
to activate and therefore showed an impairment, even
though the functional representations themselves were
intact. In summary, the Farah and McClelland (1991)
model showed how distributed semantic representations
can exhibit counterintuitive dynamics under damage, in
a manner consistent with the observed neuropsycholog-
ical data.
The Farah and McClelland (1991) model is consis-
tent with all the basic principles of our framework, but
it does not address the question of how semantic rep-
resentations develop from experience in the first place,
and it cannot be easily probed using our own common-
sense intuitions about semantics because it used random
semantic patterns (much like we did above in the past-
tense model). Therefore, we instead explore a model
that implements just one piece of the larger distributed
semantic network but allows us to see how semantic
representations can develop in the first place, and to ex-
plore the properties of these representations using our
own familiar intuitions about semantic meaning.
The semantic representations in our model emerge
from simply accumulating information about which
words tend to occur together in speech or reading. This
model is based on the ideas of Landauer and Dumais
(1997), who have developed a method they call latent
semantic analysis or LSA that is based on computing
the principal components (using a variant of PCA; see
chapter 4) of word co-occurrence statistics. They have
shown that just this co-occurrence information can yield
semantic representations that do a surprisingly good job
at mimicking human semantic judgments.
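To make the co-occurrence idea concrete, here is a minimal sketch of an LSA-style pipeline (not Landauer and Dumais's actual implementation): build a word-by-document count matrix from a toy corpus, reduce it with a truncated SVD, and compare the resulting word vectors by cosine similarity. The corpus, the number of retained dimensions, and the raw-count weighting are illustrative assumptions; full LSA typically applies a log-entropy weighting and retains a few hundred dimensions.

```python
import numpy as np

# Toy corpus: each "document" is one string (an illustrative stand-in for real text).
docs = [
    "bread butter breakfast toast",
    "butter toast jam breakfast",
    "train locomotive engine rail",
    "rail engine train station",
]

# Build the word-by-document count matrix.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        counts[index[w], j] += 1

# Truncated SVD keeps only the k strongest ("latent") dimensions.
k = 2
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
word_vecs = U[:, :k] * S[:k]          # each row is a word's reduced semantic vector

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Words that occur in similar contexts end up with similar vectors.
print(cosine(word_vecs[index["bread"]], word_vecs[index["butter"]]))      # high
print(cosine(word_vecs[index["bread"]], word_vecs[index["locomotive"]]))  # low
```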
For example, an LSA system trained on the text of a
psychology textbook was able to get a passing grade on
a multiple choice exam based on the text. Although the
performance (roughly 65-70% correct) was well short
of that of a good student, it is nonetheless surprising that
a simple automated procedure can perform as well as it
does. LSA has also been used to perform automated
essay grading, by comparing the semantic representa-
tion of a student's essay with those of various reference
essays that have received different human grades. The
correlation between grades assigned by LSA and those
of human graders was the same as that between differ-
ent human graders.
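The grading comparison can be sketched as a similarity computation in the reduced semantic space. The similarity-weighted average below is one plausible variant for illustration, not the exact procedure used in the published work; the `index` and `projection` arguments are assumed to come from an SVD like the one sketched above.

```python
import numpy as np

def lsa_vector(text, index, projection):
    """Project a bag-of-words count vector into the reduced LSA space.
    `index` maps words to rows; `projection` is an assumed U[:, :k] matrix."""
    v = np.zeros(len(index))
    for w in text.split():
        if w in index:
            v[index[w]] += 1
    return v @ projection

def grade_essay(essay_vec, reference_vecs, reference_grades):
    """Assign a grade as the similarity-weighted average of reference grades."""
    sims = np.array([
        r @ essay_vec / (np.linalg.norm(r) * np.linalg.norm(essay_vec) + 1e-12)
        for r in reference_vecs
    ])
    weights = np.clip(sims, 0.0, None)   # ignore negatively related references
    return float(weights @ np.asarray(reference_grades) / (weights.sum() + 1e-12))
```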
The word co-occurrence approach captures word as-
sociation semantics, not just definitional semantics. For
example, the definition of the word “bread” has nothing
to do with the word “butter,” but these two words are
highly linked associationally. It appears that these asso-
ciational links are important for capturing the structure
of human semantic memory. For example, in semantic
priming studies, the word “butter” is read faster when
preceded by the word “bread” than when preceded by
an unrelated word (e.g., “locomotive”).
As you might expect from chapter 4, Hebbian model
learning provides a natural mechanism for learning
word co-occurrence statistics. Indeed, CPCA Hebbian
learning is closely related to the sequential prin-
cipal components analysis (SPCA) technique that the
LSA method is based on. Interestingly, the network and
training we use here are essentially identical to those
used in the Hebbian model of receptive field develop-
ment in the early visual system, as described in chap-
ter 8. In both of these models, the network extracts the
reliable correlations from naturalistic input stimuli us-
ing a simple Hebbian learning mechanism.
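In its minimal form, the CPCA Hebbian update from chapter 4 moves each active receiving unit's weights toward the current input pattern, so the weights come to reflect the conditional probability of each input being active when the unit is active. The following sketch states just that rule; the vectorized form and shapes are implementation assumptions for illustration.

```python
import numpy as np

def cpca_update(w, x, y, lrate=0.01):
    """CPCA Hebbian rule: dw_ij = lrate * y_j * (x_i - w_ij).
    w: (n_inputs, n_hidden) weights, x: (n_inputs,) input activations,
    y: (n_hidden,) hidden-unit activations."""
    return w + lrate * y[None, :] * (x[:, None] - w)
```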
This type of learning requires a sufficiently large
sample of text for the extraction of useful statistics
about which words tend to co-occur. A particularly con-
venient source of such text is this textbook itself! So,
we trained a simple CPCA Hebbian network by having
it “read” an earlier draft of this textbook paragraph-by-
paragraph, causing it to represent the systematic pat-
terns of co-occurrence among the words. In the follow-
ing exploration, we then probe the resulting network to
see if it has captured important aspects of the informa-
tion presented in this text.
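A rough sketch of that training regime, under simplifying assumptions: each paragraph becomes a binary vector over the vocabulary, a crude k-winners-take-all stands in for the inhibitory competition used in the actual simulation, and the CPCA rule above accumulates the co-occurrence structure into the weights. After training, two words can be compared by the overlap of their learned weight rows.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_cooccurrence(paragraphs, vocab, n_hidden=50, k_winners=5,
                       epochs=5, lrate=0.01):
    """Hebbian co-occurrence learning over paragraphs (illustrative sketch)."""
    index = {w: i for i, w in enumerate(vocab)}
    w = rng.uniform(0.4, 0.6, size=(len(vocab), n_hidden))  # mid-range initial weights
    for _ in range(epochs):
        for para in paragraphs:
            x = np.zeros(len(vocab))
            for word in para.split():
                if word in index:
                    x[index[word]] = 1.0              # binary "word present" input
            h = x @ w                                  # net input to hidden units
            y = np.zeros(n_hidden)
            y[np.argsort(h)[-k_winners:]] = 1.0        # crude k-winners-take-all
            w += lrate * y[None, :] * (x[:, None] - w)  # CPCA update (see above)
    return w, index

def word_similarity(w, index, a, b):
    """Compare two words by the overlap of their learned weight rows."""
    va, vb = w[index[a]], w[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb) + 1e-12))
```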
One of the most important themes that emerges from
this exploration, which is highly relevant for the larger
distributed semantics framework, is the power of dis-
tributed representations to capture the complexity of se-
mantic information. We will see that the distributed and