The number of excitatory and inhibitory feature arrays per layer increases when going from Layer 0 (4 + 2) to Layer 2 (16 + 8). Layer 3
contains 10 excitatory and 5 inhibitory feature cells. In addition, an input feature
array is present in all layers except the topmost one.
Most projections in the network are either excitatory or inhibitory. Weights of projections that access excitatory units are non-negative; weights of projections from inhibitory units are non-positive. In contrast, the weights of projections accessing the input feature array can have any sign. These projections have a window size of 5 × 5 and either lead to excitatory features in the same layer or belong to forward projections of excitatory feature cells in the next higher layer.
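Such sign constraints are typically maintained during gradient training by clipping the weights after each update. The following NumPy sketch illustrates this idea; the function name and the clipping scheme are assumptions for illustration, not taken from the text:

import numpy as np

def constrain_weights(w, source):
    # Clip projection weights in place according to the sign of the source units.
    if source == "excitatory":
        np.maximum(w, 0.0, out=w)   # weights from excitatory units: non-negative
    elif source == "inhibitory":
        np.minimum(w, 0.0, out=w)   # weights from inhibitory units: non-positive
    # source == "input": any sign is allowed, so nothing is clipped
    return w

w = np.random.randn(8, 5, 5)        # e.g. eight 5 x 5 projection windows
constrain_weights(w, "excitatory")
assert (w >= 0.0).all()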
The excitatory feature cells of Layer 1 and Layer 2 receive forward projections
from the 4 × 4 hyper-neighborhood in the layer below them. Connections between
Layer 2 and the topmost Layer 3 are different since the resolution drops from 12 × 9
to 1 × 1. Here, the forward and backward projections implement a full connectivity
between the excitatory feature cells of one layer and all feature cells of the other
layer. The backward projections of Layer 0 and Layer 1 access all feature cells of
a single hypercolumn in the next higher layer. 2 × 2 different backward projections
exist for each excitatory feature. In all layers except the topmost one, lateral projections access all features of the 3 × 3 hyper-neighborhood around a feature cell. In
Layer 3 lateral projections are smaller because all feature cells are contained in a
1 × 1 hyper-neighborhood.
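For reference, the projection structure described above can be restated compactly as plain data; the dictionary layout and field names below are invented for clarity and carry no information beyond the text:

projections = {
    "forward": {
        "Layer1": {"from": "Layer0", "window": "4 x 4 hyper-neighborhood"},
        "Layer2": {"from": "Layer1", "window": "4 x 4 hyper-neighborhood"},
        "Layer3": {"from": "Layer2", "window": "full (12 x 9 -> 1 x 1)"},
    },
    "backward": {
        "Layer2": {"from": "Layer3", "window": "full"},
        "Layer1": {"from": "Layer2", "window": "one hypercolumn", "variants": "2 x 2"},
        "Layer0": {"from": "Layer1", "window": "one hypercolumn", "variants": "2 x 2"},
    },
    "lateral": {
        "Layer0-2": {"window": "3 x 3 hyper-neighborhood"},
        "Layer3": {"window": "1 x 1 hyper-neighborhood"},
    },
}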
The projections of the inhibitory features are simpler. They access 5 × 5 windows of all excitatory feature arrays within the same layer. In Layer 3, of course, this window size reduces to 1 × 1. While all projection units have linear transfer functions, a smooth rectifying transfer function f_st (β = 10, see Fig. 4.6(a) in Section 4.2.4) is used for the output units of all feature cells.
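The exact shape of f_st is given in Fig. 4.6(a). Purely as an illustration of what a smooth rectifier with steepness parameter β looks like, one may consider the scaled softplus below, which approaches a hard rectifier as β grows; this particular formula is an assumption and not necessarily identical to f_st:

import numpy as np

def smooth_rectifier(x, beta=10.0):
    # approximately 0 for strongly negative inputs, approximately x for
    # strongly positive inputs; converges to max(0, x) as beta -> infinity
    return np.log1p(np.exp(beta * x)) / beta

x = np.linspace(-1.0, 1.0, 5)
print(smooth_rectifier(x))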
The feature arrays are surrounded by a two-pixel-wide border. The activities of the border cells are copied from feature cells using wrap-around.
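In NumPy, this border handling corresponds to cyclic ("wrap") padding, as the following minimal sketch shows:

import numpy as np

feature_array = np.arange(12 * 9, dtype=float).reshape(12, 9)
padded = np.pad(feature_array, pad_width=2, mode="wrap")  # two-pixel border

assert padded.shape == (16, 13)
# the top border rows are copies of the bottom rows of the array:
assert np.array_equal(padded[0:2, 2:-2], feature_array[-2:, :])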
10.4 Experimental Results
Because the BioID dataset does not specify which images constitute the training and test sets, the dataset was divided randomly into 1000 training images (TRN) and 521 test images (TST). The network was trained for ten iterations on random
subsets of the training set with increasing size using backpropagation through time
(BPTT) and RPROP, as described in Chapter 6. The weighting of the quadratic error
increased linearly in time.
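The details of BPTT and RPROP are covered in Chapter 6. The sketch below shows only the linearly increasing error weighting; the assumption that the weight of iteration t is proportional to t is illustrative, as the exact schedule is specified in Chapter 6:

import numpy as np

def weighted_quadratic_error(outputs, target):
    # outputs: one network output per unfolded iteration (here: ten)
    T = len(outputs)
    total = 0.0
    for t, out in enumerate(outputs, start=1):
        total += (t / T) * np.sum((out - target) ** 2)  # later errors count more
    return total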
The first two excitatory feature arrays of the three lower layers are trained to produce the desired output blobs that indicate the eye positions. All other features are hidden; they are forced to have low mean activity.
Figure 10.5 shows the development of the trained network's output over time
when the test image from Fig. 10.3 is presented as input. One can observe that
the blobs signaling the locations of the eyes develop in a top-down fashion. After
the first iteration, they appear only in the lowest resolution. This coarse localization
is used to bias the development of blobs in lower layers. After five iterations, the