instead of “went”). Finally, they learn to treat the irreg-
ulars as irregulars again.
Figure 10.18 shows the overregularization pattern for
one of the children (Adam) in the CHILDES database of
samples of children's speech (MacWhinney & Snow,
1990), which serves as the primary documentation of
this phenomenon. As this figure makes clear, the over-
regularization is not an all-or-nothing phenomenon, as
some have mistakenly assumed. Instead, it occurs spo-
radically, with individual verbs being inflected correctly
one moment and overregularized the next, and overall
rates of overregularization varying considerably at dif-
ferent points in time (Marcus, Pinker, Ullman, Hollan-
der, Rosen, & Xu, 1992). Nevertheless, at least in a
subset of cases, there is clearly an early correct period
before any overregularization occurs, and a mature pe-
riod of subsequent correct performance, with an inter-
vening period where overregularizations are made. We
will see that the model behaves in a similar fashion.
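The sporadic, time-varying character of overregularization can be made concrete with a small calculation. The following is a minimal sketch, not the procedure used for the figure: it assumes that the rate at a given age is the number of overregularized irregular past-tense tokens divided by all irregular past-tense tokens produced at that age. The sample data, the tag names, and the function name are hypothetical and are not taken from the CHILDES database.

```python
from collections import defaultdict

# Hypothetical tagged tokens, invented for illustration only:
# (age_in_months, verb, form_type), where form_type is "correct"
# (e.g., "went") or "overreg" (e.g., "goed").
SAMPLES = [
    (27, "go", "correct"),
    (27, "come", "correct"),
    (36, "go", "overreg"),
    (36, "go", "correct"),
    (36, "fall", "correct"),
    (36, "come", "correct"),
    (48, "go", "correct"),
    (48, "fall", "correct"),
]


def overregularization_rates(samples):
    """Return {age: overregularized tokens / all irregular past-tense tokens}."""
    counts = defaultdict(lambda: [0, 0])  # age -> [overregularized, total]
    for age, _verb, form in samples:
        counts[age][1] += 1
        if form == "overreg":
            counts[age][0] += 1
    return {age: over / total for age, (over, total) in sorted(counts.items())}


rates = overregularization_rates(SAMPLES)
for age, rate in rates.items():
    print(age, rate)
```

In this toy data the same verb ("go") is produced both correctly and in overregularized form at 36 months, and the rate is zero in the samples before and after, which is the qualitative pattern the paragraph above describes.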
Existing Neural Network Models
The overregularization phenomenon was originally
interpreted as the result of a rule-based system that gets
a bit overzealous early in language acquisition. Then,
Rumelhart and McClelland (1986) developed a neural
network model that showed a U-shaped overregularization
curve, and they argued that it did so because networks
are sensitive to regularities in the input-output
mapping, and will have a tendency to overregularize.
However, Pinker and Prince (1988) pointed out several
problems with the Rumelhart and McClelland (1986)
model. Perhaps the most troubling was that most of the
U-shaped effect was apparently due to a questionable
manipulation in the training set. Specifically, a large
number of lower-frequency regular words were suddenly
introduced after the network had learned a smaller
number of high-frequency words that were mostly
irregulars. Thus, this sudden onslaught of regular words
caused the network to start treating the irregular words
like regulars: overregularization.
Although a number of network models of past tense
learning have been developed since (Plunkett & Marchman,
1993, 1991; Hoeffner, 1997, 1992; Daugherty &
Seidenberg, 1992; Hare & Elman, 1992; MacWhinney
& Leinbach, 1991), none has been entirely successful
in capturing the essential properties of the U-shaped
curve directly in terms of the properties of the network
itself, without the need for introducing environmental
or other questionable manipulations that do most of the
work. For example, the Plunkett and Marchman (1993)
model is widely regarded as a fully satisfactory account,
but it has several limitations. First, it depends critically
on a manipulation of the training environment, which
starts out much as in the original Rumelhart and
McClelland (1986) model with a small number of high
frequency, mostly irregular verbs. Then, instead of adding
new verbs all at once, they continuously add them to the
training set. This continuous adding of new, mostly
regular verbs triggers the overregularization in the network,
and does not change the basic problem that the network
itself is not driving the overregularization. Furthermore,
Hoeffner (1997) was unable to replicate their original
results using a more realistic corpus based on English,
as opposed to the artificial corpus used in the original
model.
One can distinguish two different levels of analysis
for understanding the origin of the past-tense U-shaped
developmental curve (O'Reilly & Hoeffner, in
preparation):
Mechanistic: What kinds of learning/processing
mechanisms naturally give rise to a U-shaped learn-
ing curve, and more specifically, do neural network
models naturally produce such curves, or do they
require environmental or “external” manipulations to
produce them?
Environmental: What are the actual statistical proper-
ties of the linguistic environment that surround this
U-shaped learning curve, and is it in fact reasonable
to explain this phenomenon largely in terms of these
statistics, in conjunction with a relatively generic
learning mechanism sensitive to such statistics (e.g.,
a neural network)?
It should be clear that explanations at either or both of
these levels of analysis could potentially account for the
observed U-shaped curve phenomenon, but this has not
generally been appreciated in the literature, leading to
arguments that confuse issues across these levels. Ex-
isting neural network models have tended to focus on