the particular type of line for any meaningful correla-
tions to be extracted. We will see that this conditional-
ity will simply self-organize through the interactions of
the learning rule and the kWTA inhibitory competition.
Note also that because two lines are present in every
image, the network will require at least two active hid-
den units per input, assuming each unit is representing
a particular line.
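As a reminder of the mechanism at work here, the CPCA
Hebbian rule developed earlier in the chapter drives each
weight toward the conditional probability that its sending
unit is active given that the receiving unit is active
(restated compactly here; epsilon is the learning rate):

    \Delta w_{ij} = \epsilon \, y_j (x_i - w_{ij})
    \quad\Longrightarrow\quad
    w_{ij} \to P(x_i = 1 \mid y_j = 1)

It is this conditioning on the receiver's activity that the
inhibitory competition exploits: whichever unit wins for a
given line ends up computing its correlations conditional
on that line.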
Iconify the environment window when you are done
examining the patterns, and return to viewing activations
in the network window. Now hit Step in the control panel
to present a single pattern to the network.
You should see one of the event patterns containing
two lines in the input of the network, and a pattern of
roughly two active hidden units.
The hidden layer is using the average-based kWTA
inhibition function, with the k parameter set to 2, as
you can see in the hidden_kwta_k parameter in the control
panel. This function allows for some variability in the
actual activation level, depending on the distribution of
excitation across units in the hidden layer. Thus, when
more than two units are active, these units are being
fairly equally activated by the input pattern, because
the random initial weights are not very selective. This
is an important effect, because these weaker additional
activations may enable these units to bootstrap into
stronger activations through gradual learning, should
they end up being reliably active in conjunction with a
particular input feature (i.e., a particular line in this
case).
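To make the flexibility of this function concrete, here is
a minimal sketch of average-based kWTA in Python. This is
not the simulator's own code: it operates directly on a
vector of excitation values rather than on the
conductance-based point-neuron equations, and the placement
parameter q = 0.5 is illustrative.

    import numpy as np

    def avg_based_kwta(excitation, k=2, q=0.5):
        # Place the shared inhibition between the average
        # excitation of the top-k units and the average of
        # all the remaining units.
        srt = np.sort(excitation)[::-1]
        top_avg = srt[:k].mean()
        rest_avg = srt[k:].mean()
        g_i = rest_avg + q * (top_avg - rest_avg)
        # Units whose excitation exceeds the shared
        # inhibition remain active; a flat distribution of
        # excitation around the cutoff therefore lets more
        # (or fewer) than k units stay weakly active.
        return np.maximum(excitation - g_i, 0.0)

Because the inhibition is anchored to group averages rather
than strictly to the k-th strongest unit, extra units can
remain weakly active, which is exactly what permits the
bootstrapping just described.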
You can Step some more. When you tire of single
stepping, just press the Run button on the process
control panel. You will want to turn off the Display in
the network, to make things run faster.
After 30 epochs (passes through all 45 different
events in the environment) of learning, the network will
stop. You should have noticed that the weights grid log
was updated after every 5 epochs, and that the weights
came to reflect more and more clearly the lines present
in the environment (figure 4.13). Thus, individual units
developed selective representations of the correlations
present within individual lines, while ignoring the
random context of the other lines.
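For reference, the count of 45 events follows directly from
the structure of the environment, assuming the standard
lines setup of a 5x5 input grid with 5 horizontal and 5
vertical lines: each event pairs two distinct lines, giving
C(10,2) = 45. A sketch of generating such an environment:

    import itertools
    import numpy as np

    def make_lines():
        # 10 line features: 5 horizontal and 5 vertical
        # lines on a 5x5 grid.
        lines = []
        for i in range(5):
            h = np.zeros((5, 5)); h[i, :] = 1.0
            v = np.zeros((5, 5)); v[:, i] = 1.0
            lines += [h, v]
        return lines

    lines = make_lines()
    # Every event is the overlay of two distinct lines.
    events = [np.clip(a + b, 0, 1)
              for a, b in itertools.combinations(lines, 2)]
    print(len(events))  # 45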
These individual line representations developed as a
result of the interaction between learning and inhibitory
competition, as follows. Early on, the units that won
the inhibitory competition were those that happened to
have larger random weights for the input pattern. CPCA
learning then tuned these weights to be more selective
for that input pattern, causing them to be more likely to
respond to that pattern and to others that overlap with
it (i.e., other patterns sharing one of the two lines).
To the extent that the weights are stronger for one of
the two lines in the input, the unit will be more likely
to respond to inputs containing this line, and thus the
conditional probability for the input units in this line
will be stronger than for the other units, and the
weights will continue to increase. This is where the
contrast enhancement bias plays an important role,
because it emphasizes the strongest of the unit's
correlations and deemphasizes the weaker ones. This makes
it much more likely that the strongest correlations in
the environment (the single lines) end up being
represented.
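The two pieces of that story, the CPCA update and the
contrast enhancement bias, can be sketched in isolation as
follows; the gain and offset values here are illustrative
rather than the simulator's defaults.

    import numpy as np

    def cpca_update(w, x, y, lrate=0.01):
        # CPCA Hebbian rule: for active receivers (y > 0),
        # each weight moves toward the sender's activation,
        # so over time w_ij tracks P(x_i = 1 | y_j = 1).
        return w + lrate * y[:, None] * (x[None, :] - w)

    def contrast_enhance(w, gain=6.0, offset=1.25):
        # Sigmoidal contrast enhancement: weights above the
        # offset point are pushed toward 1, weaker ones
        # toward 0, emphasizing the strongest correlations.
        w = np.clip(w, 1e-6, 1.0 - 1e-6)
        return 1.0 / (1.0 + (w / (offset * (1.0 - w))) ** -gain)

Applying contrast_enhance to the effective weights after
each cpca_update step is the kind of sharpening that turns
an initially mixed weight pattern into a single clean line.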
You might have noticed in the weights displayed in
the grid log during learning that some units initially
seemed to be becoming selective for multiple lines, but
then, as other units were better able to represent one of
those lines, they started to lose that competition and
fall back to representing only one line. Thus, the
dynamics of the inhibitory competition are critical for
the self-organizing effect, and it should be clear that a
firm inhibitory constraint is important for this kind of
learning (otherwise units will just end up being active a
lot, and representing a mish-mash of line features).
Nevertheless, the average-based kWTA function is
sufficiently flexible that it can allow more than two
units to become active, so you will probably see that
sometimes multiple hidden units end up encoding the same
line feature.
The net result of this self-organizing learning is a
nice combinatorial distributed representation, where each
input pattern is represented as the combination of the
two line features present therein. This is the “obvious”
way to represent such inputs, but you should appreciate
that the network nevertheless had to discover this
representation through the somewhat complex self-
organizing learning procedure.
To see this representation in action, turn the network
Display
back on, and
Step
through a few more events.
Notice that in general two or more units are strongly
activated by each input pattern, with the extra activation