principled understanding of why, this is becoming increasingly rare as the field matures and standards improve. Nevertheless, it is important to maintain a concerted focus on understanding models' performance in terms of a set of principles that transcend particular implementations.
Further, learning, which shapes the weight parameters in the model, is not ad hoc and under the researcher's precise control. Instead, learning is governed by a well-understood set of principles that shape the network's weights in interaction with the environment. Thus, by understanding the principles by which a network learns, the apparent complexity of the model is reduced to the relatively simple application of a small set of principles.
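To make this concrete, the sketch below shows a simple Hebbian-style weight update applied repeatedly to patterns from the environment. It is only an illustration of a co-activity-based learning rule written in Python, not the actual Leabra mechanisms; the layer sizes, learning rate, and soft weight bound are arbitrary choices for the example.

```python
import numpy as np

# Illustrative only: a simple Hebbian-style update, not the actual Leabra
# learning rules.  The weights are shaped by repeatedly applying one small
# rule to patterns from the "environment," rather than being set by hand.

rng = np.random.default_rng(0)
n_inputs, n_hidden = 8, 4
weights = rng.random((n_inputs, n_hidden)) * 0.1    # small initial weights
learning_rate = 0.05

def hebbian_step(x, w):
    """One learning step: co-activity of input and hidden units drives the change."""
    y = 1.0 / (1.0 + np.exp(-(x @ w)))                # hidden activations (sigmoid)
    dw = learning_rate * np.outer(x, y) * (1.0 - w)   # soft bound keeps weights below 1
    return w + dw

for _ in range(200):                                  # environment: random binary patterns
    pattern = (rng.random(n_inputs) > 0.5).astype(float)
    weights = hebbian_step(pattern, weights)
```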
Importantly, the majority of the models in this book use the same set of standard parameters. The few exceptions are generally based on principled considerations, not ad hoc parameter fitting. For example, we varied the parameter controlling the activity level of the hidden layers between models. As we saw in chapter 9, this manipulation maps onto known properties of different brain regions, and it has important implications for the trade-off between learning specific information about individual patterns (sparse activations) versus integrating over many patterns (more distributed activations).
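As a rough illustration of what such an activity-level parameter does, the sketch below applies a k-winners-take-all (kWTA) style constraint to a layer of hidden units. The function and the particular k values are hypothetical stand-ins, not the exact Leabra kWTA mechanism, but they show how a single parameter moves a layer between sparse and more distributed activation patterns.

```python
import numpy as np

# Hypothetical sketch of a k-winners-take-all (kWTA) style activity
# constraint; the real Leabra mechanism differs in detail.  The single
# parameter k_fraction controls how sparse the hidden representation is.

def kwta(net_input, k_fraction):
    """Keep roughly the top k_fraction of units active; silence the rest."""
    n_active = max(1, int(round(k_fraction * net_input.size)))
    threshold = np.sort(net_input)[-n_active]      # k-th largest net input
    return np.where(net_input >= threshold, net_input, 0.0)

rng = np.random.default_rng(1)
net = rng.random(20)                     # net inputs to 20 hidden units
sparse = kwta(net, 0.10)                 # low activity: pattern-specific coding
distributed = kwta(net, 0.50)            # higher activity: overlapping codes
print((sparse > 0).sum(), (distributed > 0).sum())   # 2 vs. 10 active units
```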
Novice modelers often ask how to determine how many units, layers, and the like should be used in a model. We have emphasized with the Leabra algorithm that performance does not depend very much on these parameters, as long as there are enough units and layers (chapter 4). The network will generally behave very much the same with excess units, though it may have more redundancy. Adding excess layers usually slows learning, but not nearly as much as in a backpropagation network. We think this robustness to excess degrees of freedom is essential for any plausible model of the brain, which is clearly overparameterized relative to the constraints of any given task. Thus, one basic approach is to include as many units and layers as the network appears to need to learn the task; it really should not matter (except in computational time) if you have too many.
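The following sketch illustrates this "use enough and do not worry about the excess" procedure on a toy task. It uses a generic two-layer backpropagation net purely as a stand-in for whatever algorithm a model actually uses, so the specific numbers are illustrative; the point is simply that the same check can be run with a modest and an oversized hidden layer.

```python
import numpy as np

# Illustrative procedure only: train the same toy task (XOR) with a modest
# and an oversized hidden layer, and check that the excess units do not
# hurt final performance.  A generic two-layer backprop net stands in for
# whatever learning algorithm the model actually uses.

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

def train_xor(n_hidden, epochs=5000, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=1.0, size=(2, n_hidden))
    w2 = rng.normal(scale=1.0 / np.sqrt(n_hidden), size=(n_hidden, 1))
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        h = sig(X @ w1)                       # hidden activations
        y = sig(h @ w2)                       # output activation
        delta_out = y - T                     # output error (cross-entropy gradient)
        delta_hid = (delta_out @ w2.T) * h * (1.0 - h)
        w2 -= lr * h.T @ delta_out
        w1 -= lr * X.T @ delta_hid
    return float(np.mean(np.abs(y - T)))      # final mean absolute error

# Typically both the small and the oversized network settle near zero error.
print(train_xor(n_hidden=4), train_xor(n_hidden=32))
```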
12.3.3 Models Can Do Anything
A number of challenges focus on the free parameters in neural network models. For example, critics have argued that with so many parameters, one can get these models to learn anything, so it is uninteresting to show that they do. And it is hard to know which parameters were crucial for the learning. Further, multiple models that differ greatly from one another may all successfully simulate a particular phenomenon, making it even more difficult to identify the critical mechanisms underlying behavior (the indeterminacy problem).
One might be able to train a network to do anything, perhaps even using multiple, very different models. However, many models are subjected to further tests of untrained aspects of performance, such as generalization to new problems or response to damage. A network's behavior when damaged, for example, does not arise because it was trained to behave that way. Instead, the network was trained to perform correctly, and the performance following damage emerged from the basic computational properties of the model (O'Reilly & Farah, 1999). Many of the most psychologically interesting aspects of neural network models are based on such untrained aspects of performance.
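To illustrate what such a damage test involves, the sketch below "lesions" a trained network by silencing a random fraction of its hidden units and then re-evaluates it on the trained task. The function names and setup are hypothetical rather than a specific published simulation; the key point is that nothing about the damaged behavior was itself trained.

```python
import numpy as np

# Hypothetical lesioning sketch: silence a random fraction of hidden units
# in an already-trained two-layer network, then re-test it.  The damaged
# behavior was never trained; it falls out of the learned weights.

def lesion_hidden_units(w_in, w_out, fraction, rng):
    """Zero all connections of a randomly chosen subset of hidden units."""
    n_hidden = w_in.shape[1]
    lesioned = rng.choice(n_hidden, size=int(fraction * n_hidden), replace=False)
    w_in, w_out = w_in.copy(), w_out.copy()
    w_in[:, lesioned] = 0.0                  # remove the units' incoming weights
    w_out[lesioned, :] = 0.0                 # and their outgoing weights
    return w_in, w_out

def accuracy(X, T, w_in, w_out):
    """Fraction of patterns classified correctly by the (possibly damaged) net."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    y = sig(sig(X @ w_in) @ w_out)
    return float(np.mean((y > 0.5) == (T > 0.5)))

# Usage, assuming X, T and trained weights w_in, w_out already exist:
# rng = np.random.default_rng(0)
# for frac in (0.1, 0.25, 0.5):
#     print(frac, accuracy(X, T, *lesion_hidden_units(w_in, w_out, frac, rng)))
```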
Moreover, such tests, beyond what networks were trained to do, may provide important constraints for resolving indeterminacy issues. Two very different networks may be equally good at modeling what they were trained to do, but one may provide a better match to untrained aspects of performance. Similarly, two very different networks may appear to be equally faithful to known properties of neurobiology, but one may provide a better match to a more detailed model, or to subsequent discoveries. Thus, with the vast and growing collection of top-down (behavioral) and bottom-up (biological) constraints on neural network models, it seems increasingly unlikely that we will face the indeterminacy problem of multiple, very different models that are equally good, and thus impossible to choose between.
12.3.4 Models Are Reductionistic
How can a computational model tell us anything about love, hate, free will, consciousness, and everything else