Given the explanation above about the network's poor generalization, it should be clear why both Hebbian learning and kWTA inhibitory competition can improve generalization performance. At the most general level, they constitute additional biases that place important constraints on learning and the development of representations. More specifically, Hebbian learning constrains the weights to represent the correlational structure of the inputs to a given unit, producing systematic weight patterns (e.g., cleanly separated clusters of strong correlations; chapter 4).
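The CPCA-style Hebbian rule described in chapter 4 can be sketched as follows: each weight moves toward the sending activation whenever the receiving unit is active, so it comes to encode the probability that the sender is active given that the receiver is. This is an illustrative sketch, not the simulator's implementation; the variable names and toy statistics are assumptions.

```python
import numpy as np

def hebbian_update(w, x, y, lrate=0.01):
    """CPCA-style Hebbian rule: each weight moves toward the sending
    activation x whenever the receiving unit y is active, so w comes
    to reflect P(sender active | receiver active)."""
    return w + lrate * y[:, None] * (x[None, :] - w)

# Toy demo (assumed setup): 1 receiver, 4 senders; the first two
# senders are perfectly correlated with the receiver's activity,
# the last two are active only half the time.
rng = np.random.default_rng(0)
w = rng.uniform(0.4, 0.6, size=(1, 4))
for _ in range(2000):
    x = np.array([1.0, 1.0, rng.integers(0, 2), rng.integers(0, 2)])
    y = np.array([1.0])  # receiver active on every trial
    w = hebbian_update(w, x, y)

# Weights for the correlated senders approach 1; weights for the
# uncorrelated senders hover near their base rate (about 0.5).
print(np.round(w, 2))
```

The resulting weight pattern cleanly separates the strongly correlated inputs from the rest, which is exactly the systematicity the text attributes to Hebbian learning.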
Inhibitory competition helps in two ways. First, it encourages individual units to specialize on representing a subset of items, thus parceling up the task in a much cleaner and more systematic way than would occur in an otherwise unconstrained network. Second, as discussed in chapter 3 (section 3.6.3), inhibitory competition restricts the settling dynamics of the network, greatly constraining the number of states that the network can settle into, and thus eliminating a large proportion of the attractors that can hijack generalization.
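To see how competition limits the available states, here is a minimal hard winner-take-all sketch. Note this is a simplification: the actual kWTA function described in chapter 3 computes an inhibition threshold between the k-th and (k+1)-th most excited units rather than simply zeroing the losers.

```python
import numpy as np

def kwta(net_input, k):
    """Hard k-winners-take-all sketch: only the k most excited units
    stay active; inhibition silences the rest, which sharply limits
    the states the layer can settle into."""
    act = np.zeros_like(net_input)
    winners = np.argsort(net_input)[-k:]  # indices of the k largest inputs
    act[winners] = net_input[winners]
    return act

net = np.array([0.2, 0.9, 0.1, 0.7, 0.4])
print(kwta(net, k=2))  # only the two most excited units survive
```

With k of n units allowed to be active, the layer has at most n-choose-k active patterns instead of 2^n, which is the constraint on attractor states that the text describes.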
Let's see if we can improve the network's generalization performance by adding some Hebbian learning. We cannot easily test the effects of kWTA inhibition in our simulation framework, because removing inhibitory competition necessitates other problematic compensatory manipulations such as the use of positive/negative valued weights; however, clear advantages for inhibitory competition in generalization have been demonstrated elsewhere (O'Reilly, in press).
You should have seen a substantial improvement
from adding the Hebbian learning. Now, let's see how
well pure Hebbian learning does on this task.
Set learn_rule to PURE_HEBB and Apply. This changes lrn.hebb to 1, and sets lrn.bias_lrate to 0. Do a Run.
You will probably notice that the network learns quite rapidly. The network will frequently get perfect performance on the task itself, and on the generalization test. However, every 5 or so runs, the network fails to learn perfectly, and, due to the inability of Hebbian learning to correct such errors, never gets better.
To find such a case more rapidly, you can press Stop when the network has already learned perfectly (so you don't have to wait until it finishes the prescribed number of epochs), and then press Run again.
Because this task has such obvious correlational structure that is well suited for the Hebbian learning algorithm, it is clear why Hebbian learning helps. However, even here, the network is more reliable if error-driven learning is also used. We will see in the next task that Hebbian learning helps even when the correlational structure is not particularly obvious, but that pure Hebbian learning is completely incapable of learning. These lessons apply across a range of different generalization tests, including ones that rely on more holistic similarity-based generalization, as compared to the compositional (feature-based) domain explored here (O'Reilly, in press, 1996b).
Set learn_rule on the control panel to HEBB_AND_ERR, and Apply. You should see a lrn.hebb value of .05 in the control panel now.
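The lrn.hebb parameter can be understood as a mixing proportion between the two weight-change components. The following sketch shows that idea in isolation; the function name and the example numbers are illustrative, not taken from the simulator.

```python
import numpy as np

def combined_dwt(dwt_hebb, dwt_err, hebb=0.05):
    """Mixture of learning rules: lrn.hebb weights the Hebbian
    component against the error-driven component.  hebb=0.05 means
    mostly error-driven learning with a small Hebbian bias, hebb=1
    corresponds to PURE_HEBB, and hebb=0 to purely error-driven."""
    return hebb * dwt_hebb + (1.0 - hebb) * dwt_err

dwt_hebb = np.array([0.2, -0.1])   # illustrative Hebbian weight changes
dwt_err = np.array([-0.4, 0.3])    # illustrative error-driven changes
print(combined_dwt(dwt_hebb, dwt_err))  # dominated by the error term
```

At .05, the error-driven term drives learning while the Hebbian term acts as the small correlational bias discussed above.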
Do a Run in the control panel (you may want to do a View of the TRAIN_GRAPH_LOG if that window was iconified or is otherwise no longer visible). After this, do a Batch run to collect more data.

Go to the PDP++Root window. To continue on to the next simulation, close this project first by selecting .projects/Remove/Project_0. Or, if you wish to stop now, quit by selecting Object/Quit.
Question 6.2 (a) How did this .05 of additional Hebbian learning change the results compared to purely error-driven learning? (b) Report the results from the batch text log (Batch_1_Textlog) for the batch run. (c) Explain these results in terms of the weight patterns, the unique pattern statistic, and the general effects of Hebbian learning in representing the correlational structure of the input.

6.4 Learning to Re-represent in Deep Networks

One of the other critical benefits of combined model and task learning is in the training of deep networks (with many hidden layers). As we explained previously, additional hidden layers enable the re-representation of problems in ways that make them easier to solve. This