Figure 8.12: Objects used in the object recognition model.
Figure 8.11: Structure of the object recognition model, having on- and off-center LGN inputs, a V1-like oriented edge detector layer, followed by V2 and V4/IT layers that produce increasingly invariant and featurally complex representations (V4 and IT are collapsed together due to the simplicity of the objects and the scale of the model). The Output layer represents some further use of the object representations by other parts of the brain. Hypercolumns are indicated by smaller boxes within the layers.
The connectivity patterns also reflect the relatively large scale of this model. As in the previous model, we assume that all of the units within a given hypercolumn are connected to the same inputs. We further assume that neighboring hypercolumns connect with partially overlapping, but correspondingly offset, inputs. Thus, there is some redundancy (due to the overlap) in the coding of a given input area across multiple hypercolumns, but basically different hypercolumns process different information.
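To make the overlapping receptive fields concrete, the following short Python sketch (not the original simulator code; the receptive-field width, stride, and input size are illustrative assumptions) lists the input window seen by each hypercolumn, where neighboring windows overlap by half their width:

    # Illustrative sketch: compute the input window seen by each hypercolumn,
    # with neighboring hypercolumns receiving partially overlapping but
    # correspondingly offset inputs.

    def hypercolumn_windows(n_input, rf_width=4, stride=2):
        """Return (start, end) input indices for each hypercolumn.

        With stride < rf_width, adjacent windows overlap by (rf_width - stride)
        inputs, so nearby hypercolumns redundantly code part of the same input,
        while hypercolumns farther apart see entirely different inputs.
        """
        windows = []
        start = 0
        while start + rf_width <= n_input:
            windows.append((start, start + rf_width))
            start += stride
        return windows

    if __name__ == "__main__":
        for i, (lo, hi) in enumerate(hypercolumn_windows(n_input=16)):
            print(f"hypercolumn {i}: inputs {lo}..{hi - 1}")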
Although each hypercolumn of units processes a different part of the input, the hypercolumns are all basically doing the same thing: extracting features over multiple locations and sizes. Furthermore, no part of the input is any different from any other, because objects can appear anywhere. Thus, each hypercolumn of units can share the same set of weights, which significantly reduces the amount of memory required to implement the network. We use this weight linking or weight sharing (LeCun et al., 1989) in all the multihypercolumn layers in this model. Weight sharing allows each hypercolumn to benefit from the experiences of all the other hypercolumns, speeding up learning (see the following details section for more information).
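The weight-sharing scheme can be sketched as follows, again in illustrative Python rather than the simulator's own code, with all sizes and the stand-in error signal chosen purely for illustration: a single weight matrix is applied at every hypercolumn's input window, and the weight changes computed in each hypercolumn are summed into that one shared matrix (essentially the convolutional scheme of LeCun et al., 1989).

    import numpy as np

    # Sketch of weight sharing across hypercolumns (all shapes are illustrative):
    # one weight matrix W is applied at every input window, and the weight
    # changes computed in each hypercolumn are accumulated into the same W.

    rng = np.random.default_rng(0)
    n_input, rf_width, stride, n_units = 16, 4, 2, 8
    windows = [(s, s + rf_width) for s in range(0, n_input - rf_width + 1, stride)]
    W = rng.normal(scale=0.1, size=(n_units, rf_width))   # one shared weight matrix

    def forward(x, windows):
        # Each hypercolumn applies the *same* W to its own slice of the input.
        return [W @ x[lo:hi] for (lo, hi) in windows]

    def shared_weight_update(x, errors, windows, lr=0.01):
        # Gradients from every hypercolumn are summed into the single shared W,
        # so each hypercolumn benefits from what all the others experience.
        dW = np.zeros_like(W)
        for (lo, hi), err in zip(windows, errors):
            dW += np.outer(err, x[lo:hi])
        return W - lr * dW

    x = rng.random(n_input)
    acts = forward(x, windows)              # one activation vector per hypercolumn
    errs = [a - a.mean() for a in acts]     # stand-in "error" signal, illustration only
    W = shared_weight_update(x, errs, windows)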
The environment experienced by the model contains 20 objects, each composed of a combination of 3 out of 6 horizontal and vertical lines (figure 8.12). The regularity of these objects in their basic features is critical for enabling the network to generalize to novel objects, as described later.
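As a quick check on this construction, choosing 3 of the 6 line features yields exactly C(6,3) = 20 distinct objects; the following few lines of Python (the feature labels are made up for illustration) confirm the count:

    from itertools import combinations

    # Six line features (labels are illustrative); each object is one choice
    # of 3 of these 6 lines.
    features = ["h0", "h1", "h2", "v0", "v1", "v2"]
    objects = list(combinations(features, 3))
    print(len(objects))  # 20 = C(6,3), matching the 20 objects in figure 8.12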
The need to represent a large cortical scale requires the introduction of some additional structure, particularly with respect to the way that inhibition is computed. The idea behind this structure is that inhibition acts at multiple scales, with stronger inhibition among neurons that are relatively close together (i.e., within a single cortical hypercolumn), but also significant inhibition communicated across larger distances via longer-range excitatory cortical connectivity that projects onto local inhibitory interneurons.
These multiscale inhibitory dynamics are implemented in the model with two nested levels of kWTA inhibition. Units within a single hypercolumn compete amongst themselves for a given kWTA activity level, while at the same time all of the units across hypercolumns within the layer compete for a larger kWTA activity level reflecting the longer-range inhibition. The net inhibition for each unit is then the maximum of the inhibition computed by each of these two kWTA computations.
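A minimal sketch of this nested scheme, in illustrative Python rather than the actual simulator code, is given below. The kWTA function computes an inhibitory level that allows roughly k units to remain active; here it is simplified to a threshold placed halfway between the k-th and (k+1)-th most excited units, and the layer sizes and k values are assumptions chosen only for the example.

    import numpy as np

    def kwta_threshold(excitation, k):
        """Simplified kWTA: place the inhibition level between the k-th and
        (k+1)-th most excited units, so roughly k units can stay above it."""
        sorted_exc = np.sort(excitation)[::-1]
        return 0.5 * (sorted_exc[k - 1] + sorted_exc[k])

    def nested_inhibition(excitation, hypercol_size, k_local, k_layer):
        """Per-unit inhibition = max of within-hypercolumn and layer-wide kWTA."""
        n = excitation.size
        layer_inhib = kwta_threshold(excitation, k_layer)
        inhib = np.empty(n)
        for start in range(0, n, hypercol_size):
            local = excitation[start:start + hypercol_size]
            local_inhib = kwta_threshold(local, k_local)
            inhib[start:start + hypercol_size] = max(local_inhib, layer_inhib)
        return inhib

    # Example: 4 hypercolumns of 10 units, roughly 2 active per hypercolumn
    # and 6 active across the whole layer (illustrative numbers).
    rng = np.random.default_rng(1)
    exc = rng.random(40)
    inh = nested_inhibition(exc, hypercol_size=10, k_local=2, k_layer=6)
    active = exc > inh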