reflecting the fact that some lines are coded by multiple units.
Another thing to notice in the weights shown in the grid log (figure 4.13) is that some units are obviously not selective for anything. These “loser” units (also known as “dead” units) were never reliably activated by any input feature, and thus did not experience much learning. It is typically quite important to have such units lying around, because self-organization requires some “elbow room” during learning to sort out the allocation of units to stable correlational features. Having more hidden units also increases the chances of having a large enough range of initial random selectivities to seed the self-organization process. The consequence is that you need to have more units than is minimally necessary, and that you will often end up with leftovers (plus the redundant units mentioned previously).
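
As a rough illustration of what a “loser” unit looks like (this is not part of the simulator; the weight matrix, initial weight level, and tolerance below are made-up placeholders), one could flag hidden units whose receiving weights never moved far from their random initial values, which is the signature of a unit that was never reliably activated:

    import numpy as np

    def find_loser_units(weights, init_mean=0.5, tol=0.1):
        """Flag hidden units whose receiving weights stayed near their
        initial values, i.e. units that were never reliably activated
        and so experienced little learning."""
        # Selective units drive some weights high and others low;
        # a "loser" unit's weights all remain near the initial level.
        deviation = np.abs(weights - init_mean).max(axis=1)
        return np.where(deviation < tol)[0]

    # Hypothetical weights: 4 hidden units, 25 inputs (a 5x5 grid).
    rng = np.random.default_rng(0)
    w = np.full((4, 25), 0.5) + rng.normal(0.0, 0.02, (4, 25))
    w[0, :5], w[0, 5:] = 0.9, 0.1      # unit 0 learned one horizontal line
    print(find_loser_units(w))         # -> [1 2 3], the units that learned nothing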
From a biological perspective, we know that the cortex does not produce new neurons in adults, so we conclude that in general there is probably an excess of neural capacity relative to the demands of any given learning context. Thus, it is useful to have these leftover and redundant units, because they constitute a reserve that could presumably get activated if new features were later presented to the network (e.g., diagonal lines). We are much more suspicious of algorithms that require precisely tuned quantities of hidden units to work properly (more on this later).
Unique Pattern Statistic

Although looking at the weights is informative, we could use a more concise measure of how well the network's internal model matches the underlying structure of the environment. We can plot one such measure in a graph log as the network learns.

Do View on the control panel and select TRAIN_GRAPH_LOG. Turn the network Display back off, and Run again.

This log shows the results of a unique pattern statistic (UniquePatStat in simulator parlance, shown as unq_pats in the log), which records the number of unique hidden unit activity patterns that were produced as a result of probing the network with all 10 different types of horizontal and vertical lines (presented individually). Thus, there is a separate testing process which, after each epoch of learning, tests the network on all 10 lines, records the resulting hidden unit activity patterns (with the kWTA parameter set to 1, though this is not critical due to the flexibility of the average-based kWTA function), and then counts up the number of unique such patterns.
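
A minimal sketch of how such a statistic could be computed outside the simulator (the function name, the shape of the probe data, and the simple k=1 winner-take-all step are assumptions for illustration, not the simulator's actual code):

    import numpy as np

    def unique_pattern_stat(hidden_acts, k=1):
        """Count unique hidden-layer activity patterns across probe trials.
        hidden_acts is an (n_probes, n_hidden) array, one row per probe line;
        each row is binarized with a simple k-winners-take-all before comparison."""
        patterns = set()
        for acts in hidden_acts:
            winners = np.argsort(acts)[-k:]        # the k most active units
            pattern = np.zeros(len(acts), dtype=int)
            pattern[winners] = 1
            patterns.add(tuple(pattern))
        return len(patterns)

    # Toy probe data: 10 lines, 12 hidden units, each line driving its own unit.
    acts = np.zeros((10, 12))
    for line in range(10):
        acts[line, line] = 1.0                     # line i most activates unit i
    print(unique_pattern_stat(acts))               # -> 10, a perfect internal model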
The logic behind this measure is that if each line is encoded by (at least) one distinct hidden unit, then this will show up as a unique pattern. If, however, there are units that encode two or more lines together (which is not a good model of this environment), then this will not result in a unique representation for these lines, and the resulting measure will be lower. Thus, to the extent that this statistic is less than 10, the internal model produced by the network does not fully capture the underlying independence of each line from the other lines. Note, however, that the unique pattern statistic does not care if multiple hidden units encode the same line (i.e., if there is redundancy across different hidden units); it only cares that the same hidden unit not encode two different lines.
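
To make this logic concrete, the same sketch can be applied to made-up probe data in which one hidden unit wins for two different lines, and in which a second unit redundantly codes a line that already has its own winner:

    # Continuing the unique_pattern_stat sketch above (made-up numbers).
    acts = np.zeros((10, 12))
    for line in range(10):
        acts[line, line] = 1.0
    acts[7] = 0.0
    acts[7, 2] = 1.0                    # unit 2 now wins for both line 2 and line 7
    print(unique_pattern_stat(acts))    # -> 9: those two lines collapse onto one pattern

    acts[0, 11] = 0.9                   # unit 11 redundantly codes line 0
    print(unique_pattern_stat(acts))    # -> still 9: redundancy is not penalized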
You should have seen on this run that the network produced a perfect internal model according to this statistic, which accords well with our analysis of the weight patterns. To get a better sense of how well the network learns in general, you can run a batch of 8 training runs, starting with a different set of random initial weights each time.
Do View, BATCH_LOG to open up a text log to record a summary of the training runs. Then press the Batch button.
Instead of updating every 5 epochs, the weight display now updates at the end of every training run, and the graph log is not updated at all. After the 8 training runs, the batch text log window will show summary statistics about the average, maximum, and minimum of the unique pattern statistic. The last column contains a count of the number of times that a “perfect 10” on the unique pattern statistic was recorded. You should get a perfect score for all 8 runs.
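
The summary columns of the batch text log could be reproduced along these lines (a hedged sketch: the list of per-run results is made up, and batch_summary is not a simulator function):

    import numpy as np

    def batch_summary(unq_pats_per_run):
        """Summarize the unique pattern statistic over a batch of runs:
        average, maximum, minimum, and a count of perfect 10s."""
        stats = np.asarray(unq_pats_per_run, dtype=float)
        return {
            "avg": stats.mean(),
            "max": stats.max(),
            "min": stats.min(),
            "perfect_10s": int((stats == 10).sum()),
        }

    # Hypothetical results from 8 runs, each starting from different random weights.
    print(batch_summary([10, 10, 10, 10, 10, 10, 10, 10]))
    # -> {'avg': 10.0, 'max': 10.0, 'min': 10.0, 'perfect_10s': 8}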