Given the explanation above about the network's poor generalization, it should be clear why both Hebbian learning and kWTA inhibitory competition can improve generalization performance. At the most general level, they constitute additional biases that place important constraints on learning and the development of representations. More specifically, Hebbian learning constrains the weights to represent the correlational structure of the inputs to a given unit, producing systematic weight patterns (e.g., cleanly separated clusters of strong correlations; chapter 4).
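As a concrete, simplified illustration, the following sketch implements a CPCA-style Hebbian update along the lines of the chapter 4 rule. It is written in Python/NumPy rather than the actual PDP++ code, and the function name and toy correlated-inputs example are our own illustration, not the simulator's.

```python
import numpy as np

def cpca_hebbian_update(w, x, y, lrate=0.01):
    """CPCA-style Hebbian update (a sketch of the chapter 4 rule, not the PDP++ source).

    w: (n_in, n_out) weights; x: (n_in,) sending activations; y: (n_out,) receiving activations.
    dw_ij = lrate * y_j * (x_i - w_ij): each weight is driven toward the probability that
    input i is active when receiving unit j is active, so the weights come to mirror the
    correlational structure of the inputs that the unit participates in representing.
    """
    dw = lrate * y * (x[:, None] - w)   # y broadcasts across columns, x across rows
    return w + dw

# Toy example: two inputs that are always active together for one receiving unit.
w = np.full((2, 1), 0.5)
rng = np.random.default_rng(0)
for _ in range(500):
    x = np.repeat(float(rng.integers(0, 2)), 2)   # x[0] == x[1]: perfectly correlated
    y = np.array([x[0]])                          # the unit fires when they do
    w = cpca_hebbian_update(w, x, y, lrate=0.05)
print(w)  # both weights approach 1: a cleanly separated cluster of strong correlations
```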
Inhibitory competition helps in two ways. First, it encourages individual units to specialize on representing a subset of items, thus parceling up the task in a much cleaner and more systematic way than would occur in an otherwise unconstrained network. Second, as discussed in chapter 3 (section 3.6.3), inhibitory competition restricts the settling dynamics of the network, greatly constraining the number of states that the network can settle into, and thus eliminating a large proportion of the attractors that can hijack generalization.
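To make the second point concrete, here is a deliberately simplified "hard" k-winners-take-all sketch in Python. Leabra's actual kWTA function computes a shared level of inhibition for the layer rather than a hard cutoff, but the combinatorial consequence is the same: with n units and k winners, only on the order of n-choose-k activity patterns are reachable, rather than 2^n.

```python
import numpy as np
from math import comb

def kwta_hard(net_input, k):
    """Simplified hard k-winners-take-all (illustrative only; the actual kWTA
    instead sets an inhibitory level between the k-th and k+1-th most excited
    units). Only the k most strongly driven units are allowed any activation."""
    act = np.zeros_like(net_input)
    winners = np.argsort(net_input)[-k:]                      # k largest net inputs
    act[winners] = 1.0 / (1.0 + np.exp(-net_input[winners]))  # squash the winners
    return act

net = np.array([0.2, 1.5, -0.3, 2.1, 0.9, 0.1])
print(kwta_hard(net, k=2))       # only units 1 and 3 remain active
print(comb(6, 2), "vs", 2 ** 6)  # 15 reachable active sets vs 64 unconstrained
```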
Let's see if we can improve the network's generalization performance by adding some Hebbian learning. We cannot easily test the effects of kWTA inhibition in our simulation framework, because removing inhibitory competition necessitates other problematic compensatory manipulations such as the use of positive/negative valued weights; however, clear advantages for inhibitory competition in generalization have been demonstrated elsewhere (O'Reilly, in press).
Set learn_rule on the control panel to HEBB_AND_ERR, and Apply. You should see a lrn.hebb value of .05 in the control panel now. Do a Run in the control panel (you may want to do a View of the TRAIN_GRAPH_LOG if that window was iconified or is otherwise no longer visible). After this, do a Batch run to collect more data.

Question 6.2 (a) How did this .05 of additional Hebbian learning change the results compared to purely error-driven learning? (b) Report the results from the batch text log (Batch_1_Textlog) for the batch run. (c) Explain these results in terms of the weight patterns, the unique pattern statistic, and the general effects of Hebbian learning in representing the correlational structure of the input.
You should have seen a substantial improvement from adding the Hebbian learning. Now, let's see how well pure Hebbian learning does on this task.
Set learn_rule to PURE_HEBB and Apply. This changes lrn.hebb to 1, and sets lrn.bias_lrate to 0. Do a Run.
You will probably notice that the network learns quite rapidly. The network will frequently get perfect performance on the task itself, and on the generalization test. However, every 5 or so runs, the network fails to learn perfectly, and, due to the inability of Hebbian learning to correct such errors, never gets better.
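One way to read the lrn.hebb parameter, as a rough sketch rather than the literal PDP++/Leabra source, is as the proportion of the total weight change contributed by the Hebbian term versus the error-driven term. With lrn.hebb at 1, the error-driven term vanishes entirely, which is why such mistakes are never corrected.

```python
def combined_weight_change(err_dwt, hebb_dwt, hebb=0.05, lrate=0.01):
    """Sketch of mixing error-driven and Hebbian weight changes via a
    lrn.hebb-like proportion (illustrative only, not the simulator code).

    hebb = 0.05 -> HEBB_AND_ERR: mostly error-driven, with a Hebbian bias.
    hebb = 1.0  -> PURE_HEBB: the error-driven term drops out, so an error
                   that the input correlations happen to favor is never fixed.
    """
    return lrate * (hebb * hebb_dwt + (1.0 - hebb) * err_dwt)
```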
To find such a case more rapidly, you can press Stop when the network has already learned perfectly (so you don't have to wait until it finishes the prescribed number of epochs), and then press Run again.
Because this task has such obvious correlational structure, which is well suited to the Hebbian learning algorithm, it is clear why Hebbian learning helps. However, even here, the network is more reliable if error-driven learning is also used. We will see in the next task that Hebbian learning helps even when the correlational structure is not particularly obvious, but that pure Hebbian learning is completely incapable of learning. These lessons apply across a range of different generalization tests, including ones that rely on more holistic similarity-based generalization, as compared to the compositional (feature-based) domain explored here (O'Reilly, in press, 1996b).
Go to the PDP++Root window. To continue on to the next simulation, close this project first by selecting .projects/Remove/Project_0. Or, if you wish to stop now, quit by selecting Object/Quit.
6.4 Learning to Re-represent in Deep Networks
One of the other critical benefits of combined model and task learning is in the training of deep networks (with many hidden layers). As we explained previously, additional hidden layers enable the re-representation of problems in ways that make them easier to solve. This
Custom Search