a faster dt.hidden parameter (.2 instead of .04; more on this below).
Go to the PDP++Root window. To continue on to the next simulation, close this project first by selecting .projects/Remove/Project_0. Or, if you wish to stop now, quit by selecting Object/Quit.
Now select KWTA_AVG_INHIB for the inhib_type to use the average-based kWTA function (Apply), and Run again.

You should observe that the hidden layer activation stabilizes on the target activation level of 15 percent.
3.5.5 Digits Revisited with kWTA Inhibition
To cap off our explorations of inhibition, we return to the digits example and revisit some of the issues we originally explored without the benefits of inhibition. This will give you a better sense of how inhibition, specifically the basic kWTA function, performs in a case where the representations are more than just random activation patterns.
To test the set point behavior of the kWTA functions, run the network with input_pct levels of 10 and 30 (do not forget to hit NewInput) in addition to the standard 20 (you can do this for both types of kWTA function).

Notice that these functions exhibit stronger set point behavior than the inhibitory unit based inhibition (with the average-based kWTA showing just slightly more variability in overall activity level). This is because the kWTA functions are designed explicitly to have a set point, whereas the inhibitory units only roughly produce set-point behavior. Thus, we must always remember that the kWTA functions are merely an idealized approximation of the effects of inhibitory neurons, and do not behave in an identical fashion.
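The set point behavior described above can be illustrated with a minimal sketch. It is not the simulator's code. It assumes each unit is summarized by a single "threshold inhibition" value (the amount of inhibition that would hold it exactly at threshold), that the basic function places inhibition a fraction q of the way between the k+1th and kth such values, and that the average-based function places it between the average of the top k and the average of the rest. The q values (.25 and .6), the layer size of 20, k = 3 (15 percent), and all function names are assumptions made for illustration; fresh random values stand in for hitting NewInput.

```python
import random


def basic_kwta(g_theta, k, q=0.25):
    """Inhibition placed between the (k+1)th and kth threshold values."""
    ranked = sorted(g_theta, reverse=True)
    return ranked[k] + q * (ranked[k - 1] - ranked[k])


def avg_kwta(g_theta, k, q=0.6):
    """Inhibition placed between the average of the rest and of the top k."""
    ranked = sorted(g_theta, reverse=True)
    top = sum(ranked[:k]) / k
    rest = sum(ranked[k:]) / (len(ranked) - k)
    return rest + q * (top - rest)


def n_active(g_theta, g_i):
    """Number of units whose threshold inhibition exceeds the applied inhibition."""
    return sum(g > g_i for g in g_theta)


def set_point_demo(n_units=20, k=3, levels=(10, 20, 30), seed=0):
    """Return {level: (active under basic kWTA, active under average kWTA)}."""
    rng = random.Random(seed)
    results = {}
    for level in levels:
        # Hypothetical stand-in for a new random input at this input_pct level.
        g_theta = [level / 100 * rng.random() for _ in range(n_units)]
        results[level] = (
            n_active(g_theta, basic_kwta(g_theta, k)),
            n_active(g_theta, avg_kwta(g_theta, k)),
        )
    return results


for level, (basic, avg) in set_point_demo().items():
    print(level, basic, avg)
```

Under these assumptions the basic function leaves exactly k units above threshold whenever the kth and k+1th values differ, whatever the overall input level, while the count under the average-based function depends on how the values are distributed.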
Next we will see one of the main advantages of the
kWTA functions.
Open the project inhib_digits.proj.gz in chapter_3 to begin.

This is essentially identical to the loc_dist.proj.gz project you explored in section 3.3.3, except that basic kWTA inhibition is in effect, and is controlled by the hidden_k parameter that specifies the maximum number of hidden units that can be strongly active. The bias weights have also been turned off by default for the localist network, for reasons that will become clear below.
Set the input_pct back to the default 20 (and NewInput). Set inhib_type to KWTA_INHIB and try to find the fastest update parameter dt.hidden (in increments of .1, to a maximum of 1) that does not result in significant oscillatory behavior.
View the GRID_LOG, then Run with the default hidden_k parameter of 1.
You should get the expected localist result that a single hidden unit is strongly active for each input. Note that the leak current g_bar_l is relatively weak at 1.5, so it is not contributing to the selection of the active unit (you can prove this to yourself by setting it to 0 and rerunning; be sure to set it back to 1.5 if you do this).
Question 3.13 (a) What was the highest value of dt.hidden that you found? How does this compare with the value of this parameter for unit-based inhibition (.04)? (b) Why do you think kWTA can use such a fast update rate where unit-based inhibition cannot?
Now, increase the hidden_k parameter to 2, and Run again.
Notice that in several cases, only 1 unit is strongly activated. This is because the second and third (k and k+1th) units had identical excitatory net input values, meaning that when the inhibition was placed right between them, it was placed right at their threshold levels of inhibition. Thus, both of these units were just at threshold (which results in the weak activation shown due to the effects of the noise in the noisy XX1 activation function as described in section 2.5.4). This is like the situation shown in figure 3.24b, except that the most active unit is well above the k and k+1th ones.
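The tie between the k and k+1th units can be worked through in a minimal sketch. It assumes the usual conductance-based form for the inhibition that holds a unit exactly at threshold, with illustrative values for the reversal potentials, threshold, leak, and the placement fraction q (all of these numbers, the net input values, and the function names are assumptions, not the project's actual settings). It checks only which units end up strictly above threshold, so the weak noise-driven activation of the tied units is not modeled.

```python
def threshold_inhibition(g_e, g_l=0.1, e_e=1.0, e_l=0.15, e_i=0.15, theta=0.25):
    """Inhibitory conductance that would hold a unit exactly at threshold.

    All default parameter values are illustrative assumptions.
    """
    return (g_e * (e_e - theta) + g_l * (e_l - theta)) / (theta - e_i)


def basic_kwta(net_inputs, k, q=0.25):
    """Place inhibition a fraction q of the way from the (k+1)th to the kth unit."""
    ranked = sorted((threshold_inhibition(g) for g in net_inputs), reverse=True)
    g_k, g_k1 = ranked[k - 1], ranked[k]
    return g_k1 + q * (g_k - g_k1)


def above_threshold(net_inputs, k):
    """Mark each unit whose threshold inhibition exceeds the applied inhibition."""
    g_i = basic_kwta(net_inputs, k)
    return [threshold_inhibition(g) > g_i for g in net_inputs]


# Hypothetical excitatory net inputs for four hidden units, with k = 2.
distinct = above_threshold([0.9, 0.6, 0.5, 0.2], k=2)
tied = above_threshold([0.9, 0.5, 0.5, 0.2], k=2)
print(sum(distinct), sum(tied))  # prints: 2 1
```

With distinct values, the inhibition falls between the second and third units and two units clear threshold. With the second and third tied, the gap between them is zero, so the inhibition lands exactly on their shared threshold value and only the strongest unit remains above it.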
Return the dt.hidden parameter to .2 before continuing.
For the k-or-less property of the basic kWTA function to apply, you have to set a leak current value g_bar_l that prevents weak excitation from activating the units, but allows strong excitation to produce activation.
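The role of the leak in the k-or-less property can be sketched as follows. This is a simplified illustration, not the simulator's implementation: it assumes the same conductance-based threshold formula as is standard for these units, treats the leak conductance as simply g_bar_l, and assumes that the applied inhibition cannot go below zero. The reversal potentials, threshold, q, the net input values, and the function names are hypothetical.

```python
# Illustrative constants (assumed, not taken from the project file).
E_E, E_L, E_I, THETA = 1.0, 0.15, 0.15, 0.25


def threshold_inhibition(g_e, g_bar_l):
    """Inhibition that would hold a unit at threshold; negative if leak alone wins."""
    return (g_e * (E_E - THETA) + g_bar_l * (E_L - THETA)) / (THETA - E_I)


def count_active(net_inputs, k, g_bar_l, q=0.25):
    """Count units above threshold under basic kWTA with a leak current."""
    g_theta = [threshold_inhibition(g, g_bar_l) for g in net_inputs]
    ranked = sorted(g_theta, reverse=True)
    g_i = ranked[k] + q * (ranked[k - 1] - ranked[k])
    g_i = max(g_i, 0.0)  # assumption: inhibitory conductance is never negative
    return sum(g > g_i for g in g_theta)


# Hypothetical net inputs for three units, with k = 2 and g_bar_l = 1.5.
print(count_active([0.15, 0.10, 0.05], k=2, g_bar_l=1.5))  # all weak
print(count_active([0.60, 0.15, 0.10], k=2, g_bar_l=1.5))  # one strong
print(count_active([0.60, 0.50, 0.10], k=2, g_bar_l=1.5))  # two strong
```

In this sketch a unit whose excitation cannot overcome the leak has a negative threshold inhibition, so it stays off even when the kWTA function would otherwise have admitted it. That is why the number of active units can fall below k but never rise above it.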
To see this k-or-less property, increase g_bar_l (in .1 increments) to find a value that prevents excitation from an input_pct of 10 or less from activating any of the hidden units, but allows excitation of 20 or more to activate both layers.