bled during the period this book was written). Thus, it should be increasingly possible to implement large fine-grained models to test whether the simplified, scaled-down models provide a reasonable approximation to more realistic implementations.
12.3.2 Models Are Too Complex

A common criticism of neural network models is that they are not useful theoretical tools because they are much more complicated than a simple verbal theory (McCloskey, 1991). A nice reply to this criticism was given by Seidenberg (1993), who emphasizes the distinction between descriptive and explanatory theories. McCloskey's arguments depend on a descriptive theoretical framework, where the goal is to describe a set of phenomena with theoretical constructs that map relatively transparently onto the phenomena; such theories essentially provide a concise and systematic description of a set of data.

The complexity of neural network models can sometimes make them unsuitable as purely descriptive theories, which is the thrust of McCloskey's argument. However, as we hope this book has demonstrated, neural network models are very well suited for developing explanatory theories, which explain a set of phenomena in terms of a small set of deeper, independently motivated principles. The implemented model serves as a test of the sufficiency of these principles to account for the data, and is thus an essential tool in the development and refinement of the theory.

For example, McClelland et al. (1995), leveraging work on statistical learning by White (1989b), provided a theoretical account of the tradeoff between rapid arbitrary and slow integrative learning, and related this tradeoff to the complementary roles of the hippocampus and neocortex in learning and memory. The arguments in this work are based on general principles, not particular implementations, and are therefore truly theoretical. Instantiations of these principles were then implemented to demonstrate their applicability.

These instantiations required all the concomitant assumptions, simplifications, and the like that McCloskey (1991) argued irrevocably cloud the theoretical importance of network models. However, because the issues were analyzed in terms of more general principles that apply to virtually any kind of statistical learning mechanism (which includes all commonly used neural network learning algorithms), the models could be understood in terms of these principles.
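To make this fast/slow tradeoff concrete, here is a minimal sketch in the spirit of that analysis (the one-layer linear network, delta-rule updates, pattern sizes, and learning rates are all illustrative assumptions, not McClelland et al.'s implementation). A large learning rate memorizes a new arbitrary association almost immediately but catastrophically interferes with previously learned associations, whereas a small learning rate with interleaved training integrates the new item while preserving the old ones:

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_out, n_old = 20, 20, 10

    def rand_patterns(n):
        x = rng.choice([0.0, 1.0], size=(n, n_in))
        return x / np.linalg.norm(x, axis=1, keepdims=True)  # unit length keeps updates stable

    old_x, old_y = rand_patterns(n_old), rng.choice([0.0, 1.0], size=(n_old, n_out))
    new_x, new_y = rand_patterns(1), rng.choice([0.0, 1.0], size=(1, n_out))

    def train(W, xs, ys, lr, epochs):
        # Delta-rule (LMS) updates on a one-layer linear network.
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                W += lr * np.outer(y - W @ x, x)
        return W

    def err(W, xs, ys):
        return float(np.mean((ys - xs @ W.T) ** 2))

    # Slowly acquire a set of old arbitrary associations ("cortical" knowledge).
    W = train(np.zeros((n_out, n_in)), old_x, old_y, lr=0.05, epochs=2000)

    # Fast learning of the new item alone: quick acquisition, old knowledge damaged.
    W_fast = train(W.copy(), new_x, new_y, lr=0.5, epochs=20)

    # Slow, interleaved learning: new item integrated, old knowledge preserved.
    all_x, all_y = np.vstack([old_x, new_x]), np.vstack([old_y, new_y])
    W_slow = train(W.copy(), all_x, all_y, lr=0.05, epochs=2000)

    print("old-pattern error after fast learning:    ", err(W_fast, old_x, old_y))
    print("old-pattern error after slow interleaving:", err(W_slow, old_x, old_y))

Running the sketch shows old-pattern error rising sharply under fast learning and staying near zero under slow interleaved learning, which is the computational pressure behind positing separate hippocampal and neocortical learning systems.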
Another good example comes from Plaut et al. (1996), who provided both an analytical and an implemented model of the effects of regularity and frequency on learning and reaction time in reading.
Another major aspect of model complexity is the issue of interpretability. A traditional complaint about neural network models (typically backpropagation networks) is that they cannot be inspected after learning to discover anything about what they have learned; that is, network performance is uninterpretable (e.g., Young & Burton, 1999). This issue has also been emphasized by people who use neural networks to solve practical problems.
However, we must accept that the brain is a complex dynamic system and is not likely to be easy to reverse-engineer. Thus, restricting oneself to overly simplistic models that are easy to understand is not likely to be a good approach (O'Reilly & Farah, 1999).
Further, as discussed in chapter 6, the standard backpropagation algorithm is likely to be particularly opaque to interpretation; interpretability may become less of an issue with the development of additional algorithms and constraints on learning. Backpropagation is typically highly underconstrained in its learning, so the weights do not tend to align strongly with the relevant aspects of the task. Hebbian model learning provides a generally useful bias that produces much more constrained, easily interpretable weights. This advantage of Hebbian constraints was exploited throughout the text, for example in the object recognition model (chapter 8) and the reading and past-tense language models (chapter 10).
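To illustrate why Hebbian constraints yield readable weights, here is a minimal sketch of the CPCA Hebbian rule from chapter 4 (the unit counts, learning rate, and the simple winner-take-all stand-in for the book's full inhibitory competition are all illustrative assumptions). Under the rule dw_ij = epsilon * y_j * (x_i - w_ij), each weight converges toward the conditional probability that input i is active when the receiving unit is active, so a trained unit's weight vector can be read directly as a picture of what it represents:

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hidden, epsilon = 16, 4, 0.05

    # Four noisy binary input "categories" (illustrative environment).
    prototypes = (rng.random((n_hidden, n_in)) < 0.3).astype(float)
    W = 0.45 + 0.1 * rng.random((n_hidden, n_in))  # weights start near 0.5

    for _ in range(5000):
        x = prototypes[rng.integers(n_hidden)].copy()
        flip = rng.random(n_in) < 0.05             # a little input noise
        x[flip] = 1.0 - x[flip]
        # Winner-take-all activation: a crude stand-in for inhibitory competition.
        y = np.zeros(n_hidden)
        y[np.argmax(W @ x)] = 1.0
        # CPCA Hebbian rule: only the active unit's weights move toward x.
        W += epsilon * y[:, None] * (x[None, :] - W)

    # Each row of W approximates P(x_i = 1 | unit j active) and is directly
    # interpretable as the input pattern that unit has come to represent.
    print(np.round(W, 2))

Because each weight is bounded in [0, 1] and tracks a conditional probability, inspecting a row of W is meaningful in a way that unconstrained backpropagation weights typically are not.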
The resulting models provide a nice balance between computational power and interpretability. Hopefully, such models will appeal to those who currently advocate computationally weak localist models just because they are more interpretable.
Thus, although a number of published models constitute relatively unanalyzed and/or uninterpreted implementations that show effects without a clear princi-