Unsupervised Learning - Hierarchical Neural Networks for Image Interpretation


[Figure 5.1: network diagram. Input; Layer 0 (32x32x4); Layer 1 (16x16x8); Layer 2 (8x8x16); Layer 3 (4x4x32); Layer 4 (2x2x64); Layer 5 (1x1x128). Legend: excitatory, inhibitory, feature sum.]

Fig. 5.1. Learning a hierarchy of sparse features - network architecture. The Neural Abstraction Pyramid consists of six layers. Only forward projections are used. Excitation is specific, while unspecific inhibition is mediated by the subsampled smoothed feature sums.
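
The layer sizes in Fig. 5.1 follow a regular pattern: the resolution halves and the number of excitatory feature arrays doubles from one layer to the next. The following is a minimal sketch of that pattern, not code from the source; the function name `pyramid_shapes` and its parameters are hypothetical.

```python
def pyramid_shapes(base_resolution=32, base_features=4, num_layers=6):
    """Return (height, width, features) for each layer of the pyramid.

    Assumes the pattern read off Fig. 5.1: resolution halves and the
    number of excitatory feature arrays doubles with each layer.
    """
    shapes = []
    resolution, features = base_resolution, base_features
    for _ in range(num_layers):
        shapes.append((resolution, resolution, features))
        resolution //= 2   # half the resolution in the next layer
        features *= 2      # twice as many feature arrays
    return shapes


shapes = pyramid_shapes()
cell_counts = [h * w * f for (h, w, f) in shapes]

for layer, ((h, w, f), cells) in enumerate(zip(shapes, cell_counts)):
    print(f"Layer {layer}: {h}x{w}x{f} ({cells} excitatory cells)")
```

Under this pattern the number of excitatory cells halves with each layer, because the resolution drops by a factor of four while the feature count only doubles.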

that represents the subsampled sum of the features in the layer below. This feature is inhibitory.

Patterns are presented to the input feature array located in the bottom layer of the network. The input is analyzed by four excitatory feature arrays of Layer 0 that compute center-surround features. Each of them has a single lateral projection with direct access to the input array. The weights of these projections have a difference-of-Gaussian structure with two different scales and two polarities, so that fine and coarse foreground and background features are detected. The projection unit has a linear transfer function and contributes with weight one to the output unit, which has a saturating rectifying transfer function $f_{\mathrm{p\_sat}}$ ($\alpha = 1$, see Section 4.2.4) that limits the activities to the interval $[0, 1]$. This transfer function is also used for the output units of the excitatory feature cells in the higher layers.
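
The Layer 0 computation described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the author's implementation. The kernel sizes and sigmas are hypothetical, since the text gives no values. The names `on`/`off` stand for the two polarities without claiming which one corresponds to foreground or background. Zero padding at the image border is assumed. The transfer function is modelled as a clamp to $[0, \alpha]$, which matches the stated interval $[0, 1]$ for $\alpha = 1$; the exact definition is in Section 4.2.4.

```python
import math


def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian on a size x size grid (size must be odd)."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def dog_kernel(size, sigma_center, sigma_surround, polarity=1):
    """Difference of two normalized Gaussians; polarity is +1 or -1."""
    center = gaussian_kernel(size, sigma_center)
    surround = gaussian_kernel(size, sigma_surround)
    return [[polarity * (c - s) for c, s in zip(crow, srow)]
            for crow, srow in zip(center, surround)]


def saturate(x, alpha=1.0):
    """Assumed form of the saturating rectifier: clamp x to [0, alpha]."""
    return min(alpha, max(0.0, x))


def center_surround(image, kernel):
    """Linear projection (weighted sum) followed by the saturating output unit."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:  # zero padding (assumption)
                        total += kernel[dr + half][dc + half] * image[rr][cc]
            out[r][c] = saturate(total)
    return out


# Two scales x two polarities = four feature arrays.
# Sizes and sigmas below are hypothetical placeholders.
LAYER0_KERNELS = {
    "fine_on": dog_kernel(5, 0.6, 1.2, +1),
    "fine_off": dog_kernel(5, 0.6, 1.2, -1),
    "coarse_on": dog_kernel(9, 1.2, 2.4, +1),
    "coarse_off": dog_kernel(9, 1.2, 2.4, -1),
}

# A single bright pixel on a dark background.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0
responses = {name: center_surround(image, k)
             for name, k in LAYER0_KERNELS.items()}
```

Because both Gaussians are normalized, each kernel sums to approximately zero, so uniform regions produce no response. For the bright pixel, the `on` arrays respond at the center while the `off` arrays are clipped to zero there.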

The feature sum $S_l$ has only a single lateral projection with direct access to all excitatory features of a layer. It weights the 3 × 3 neighborhood of its hypercolumn with a binomial kernel that is scaled with a gain factor. The gain decreases with height, such that the central weight decreases from 0.125 in Layer 0 to 0.015625 in Layer 5. The transfer functions of both the projection unit and its output unit are linear. On the next higher layer, the inhibitory feature array $S_l$ computes the average of a 2 × 2 window of $S_l$.
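
The feature sum and its subsampling can be illustrated with a short sketch. It assumes the 3 × 3 binomial kernel is the outer product of (1, 2, 1) with itself, that the same kernel is applied to every excitatory feature array, and that positions outside the array contribute zero. None of these details are stated in the text. Only the end points of the central weight (0.125 and 0.015625) come from the source; the values for the intermediate layers are not given and are not guessed here. All function names are hypothetical.

```python
# Unscaled 3x3 binomial kernel: outer product of (1, 2, 1) with itself.
BINOMIAL_3X3 = [[1, 2, 1],
                [2, 4, 2],
                [1, 2, 1]]


def feature_sum_kernel(central_weight):
    """Scale the binomial kernel so that its center equals central_weight."""
    scale = central_weight / BINOMIAL_3X3[1][1]
    return [[w * scale for w in row] for row in BINOMIAL_3X3]


def feature_sum(features, kernel):
    """Weighted sum over the 3x3 neighborhood of all excitatory features.

    features: list of 2-D arrays (lists of lists) with identical shape.
    Both units are linear, so no transfer function is applied.
    """
    h, w = len(features[0]), len(features[0][0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:  # zero padding (assumption)
                        column = sum(f[rr][cc] for f in features)
                        total += kernel[dr + 1][dc + 1] * column
            out[r][c] = total
    return out


def average_2x2(s):
    """Subsample by averaging non-overlapping 2x2 windows."""
    return [[(s[r][c] + s[r][c + 1] + s[r + 1][c] + s[r + 1][c + 1]) / 4.0
             for c in range(0, len(s[0]), 2)]
            for r in range(0, len(s), 2)]


# Toy example: four 4x4 feature arrays that are fully active.
kernel0 = feature_sum_kernel(0.125)          # Layer 0 central weight
features = [[[1.0] * 4 for _ in range(4)] for _ in range(4)]
s0 = feature_sum(features, kernel0)
s0_sub = average_2x2(s0)                     # input to inhibition one layer up
```

With a central weight of 0.125 the kernel entries sum to 0.5, so the smoothed sum grows with the number of active features in the neighborhood. This is what lets it act as an unspecific inhibitory signal.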

The basic processing elements used for the excitatory features in Layer 1 to Layer 5 have two projections. One is the specific excitatory forward projection that directly accesses overlapping 4 × 4 windows of all excitatory feature arrays in the layer below. The other is the unspecific inhibitory projection that accesses the sub-
