semantic representations in the brain involve the entirety of the associations between language representations and those in the rest of the cortex, and are thus complex and multifaceted. Language input may shape semantic representations by establishing co-occurrence relationships among different words, such that words that co-occur are likely to be semantically related. Landauer and Dumais (1997) have shown that a Hebbian-like, PCA-based mechanism can develop useful semantic representations from word co-occurrence in large bodies of text, and that these representations appear to capture common-sense relationships among words. We explore a model of this idea using the CPCA Hebbian learning developed in chapter 4.
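To make the co-occurrence idea concrete, here is a minimal sketch in the spirit of Landauer and Dumais's latent semantic analysis (not the CPCA Hebbian model explored later in this chapter): it builds a word-by-document count matrix from a toy corpus, reduces it with a singular value decomposition, and compares words by the cosine of their reduced vectors. The corpus, the latent dimensionality, and the function names are illustrative assumptions.

    import numpy as np

    # Toy corpus: each "document" is a short context (illustrative assumption).
    docs = [
        "the dog chased the cat",
        "the cat chased the mouse",
        "the boy read the book",
        "the girl read the paper",
    ]

    # Build a word-by-document co-occurrence (count) matrix.
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            counts[index[w], j] += 1

    # Reduce to a few latent dimensions with an SVD (the LSA step).
    U, S, Vt = np.linalg.svd(counts, full_matrices=False)
    k = 2                          # latent dimensionality (assumed)
    word_vecs = U[:, :k] * S[:k]   # each row is a word's semantic vector

    def similarity(w1, w2):
        """Cosine similarity between two words' latent vectors."""
        a, b = word_vecs[index[w1]], word_vecs[index[w2]]
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    # Words that share contexts end up with similar vectors.
    print(similarity("dog", "cat"), similarity("dog", "book"))

Words that appear in overlapping contexts (here, "dog" and "cat") end up closer in the reduced space than words that do not, even though no semantic information was supplied beyond the raw co-occurrence counts.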
Although we focus primarily on individual words, language clearly involves higher levels of processing as well — sequences of words must somehow be integrated over time to produce representations of the meaning of larger-scale structures such as phrases, sentences, and paragraphs. Similarly, complex internal representations must be translated into a sequence of simpler expressions during speech production or writing. Thus, temporally extended sequential processing, as developed in chapter 6, is critical for understanding these aspects of language. The specialized memory systems of the hippocampus and frontal cortex, as discussed in chapter 9, are also likely to be important.
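To illustrate in the simplest terms what integrating a sequence of words over time can look like, the sketch below implements a bare-bones simple recurrent (Elman-style) network trained with a truncated error-backpropagation update. This is only a stand-in for the sequential-processing models developed in chapter 6; the toy sentence, network sizes, and learning rate are assumptions made purely for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy sequence task: predict the next word in a short sentence (assumed data).
    sentence = "the boy chased the dog".split()
    vocab = sorted(set(sentence))
    V = len(vocab)
    idx = {w: i for i, w in enumerate(vocab)}

    def one_hot(i):
        v = np.zeros(V)
        v[i] = 1.0
        return v

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    H = 8                                   # hidden/context units (assumed)
    W_xh = rng.normal(0, 0.1, (H, V))       # input -> hidden
    W_hh = rng.normal(0, 0.1, (H, H))       # context -> hidden (the recurrence)
    W_hy = rng.normal(0, 0.1, (V, H))       # hidden -> output

    lr = 0.1
    for epoch in range(500):
        h = np.zeros(H)                     # context starts empty each pass
        for t in range(len(sentence) - 1):
            x = one_hot(idx[sentence[t]])
            target = idx[sentence[t + 1]]
            h_prev = h
            h = np.tanh(W_xh @ x + W_hh @ h_prev)   # integrate input with context
            y = softmax(W_hy @ h)
            # One-step (truncated) gradient update on the cross-entropy loss.
            dy = y.copy()
            dy[target] -= 1.0
            dh = (W_hy.T @ dy) * (1 - h ** 2)
            W_hy -= lr * np.outer(dy, h)
            W_xh -= lr * np.outer(dh, x)
            W_hh -= lr * np.outer(dh, h_prev)

    # After training, the carried-forward context informs the next-word prediction.
    h = np.zeros(H)
    for w in sentence[:-1]:
        h = np.tanh(W_xh @ one_hot(idx[w]) + W_hh @ h)
    print("after 'the boy chased the' ->", vocab[int(np.argmax(W_hy @ h))])

The point of the context layer is that the same input word ("the") can lead to different predictions depending on what came before, which is the minimal form of integrating words over time.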
The temporally extended structures of language are characterized by syntax, and we will see how networks can learn to process sentences constructed according to simple syntactic rules. Behavioral data suggest that, despite having regularities, natural language syntax is highly case-specific, because the interpretation of a given sentence often depends on the specific meanings of the words involved. Again, this is easy to account for with the specialized, dedicated representations in a neural network. We explore this interaction between semantics and syntax in a replication of the sentence gestalt model of St. John and McClelland (1990).
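As a concrete (and purely illustrative) example of what "sentences constructed according to simple syntactic rules" can mean as training material, the following sketch samples word strings from a tiny phrase-structure grammar; the rules and vocabulary are invented here and are not those used in the sentence gestalt model.

    import random

    # A tiny phrase-structure grammar (illustrative; not the sentence gestalt grammar).
    grammar = {
        "S":  [["NP", "VP"]],
        "NP": [["the", "N"]],
        "VP": [["V", "NP"], ["V"]],
        "N":  [["boy"], ["girl"], ["ball"], ["dog"]],
        "V":  [["chased"], ["saw"], ["slept"]],
    }

    def expand(symbol):
        """Recursively expand a symbol into a list of terminal words."""
        if symbol not in grammar:            # terminal word
            return [symbol]
        production = random.choice(grammar[symbol])
        words = []
        for sym in production:
            words.extend(expand(sym))
        return words

    # Generate a small corpus of syntactically well-formed training sentences.
    for _ in range(5):
        print(" ".join(expand("S")))

Note that such a generator respects only the syntactic rules; whether a sentence makes semantic sense depends on the specific words involved, which is exactly the interaction the sentence gestalt model addresses.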
We end this introduction with a broad perspective on language as a metaphor for how distributed representations can provide such a powerful mechanism for encoding information. The fundamental source of power in language comes from the ability to flexibly recombine basic elements into novel configurations (phonemes into words, words into sentences, sentences into paragraphs, paragraphs into sections, etc.). The meaning of these novel combinations emerges as a distributed entity existing over time, and is not localizable to any specific piece. Similarly, distributed representations exist as novel combinations of more basic unit-level pieces that convey meaning as an emergent property of the whole. Indeed, we do not think this similarity is accidental: language is the way we unpack distributed representations in our brains and communicate them to another person via a serial communications channel, with the hope that a corresponding distributed representation will be activated in the receiver's head.
10.2 The Biology and Basic Representations of Language
Language involves a range of different cortical areas. To situate the models that follow, we first identify some of the main brain areas and their potential interactions. We also discuss relevant aspects of the input/output modalities of language. We assume that everyone is familiar with the visual properties of words, and so focus on the details of phonology. Most people are not explicitly familiar with these details, even though they obviously have extensive implicit familiarity with them (as evidenced by their ability to produce and comprehend speech).
10.2.1 Biology
The biological basis of language, specifically the anatomical specializations of different cortical areas for different language functions, is difficult to study for several reasons. First, because nonhuman animals do not have full-fledged language abilities, we cannot perform invasive electrical recording studies of language function (and if other animals did have language function, we probably wouldn't stick electrodes in their brains!). Thus, we must rely on "natural experiments" that produce brain damage in humans (i.e., neuropsychology) and on neuroimaging. Although some progress has been made using neuropsychological and neuroimaging methods to identify specific relationships between corti-