and it is determined by the index $L^p_{kl}$ that depends on the feature array $(k, l)$ and the projection index $p$, but not on the cell position $(i, j)$ within its feature array. Since the connections have to be local, they originate either in the same layer ($L^p_{kl} = l$) for lateral projections or in an adjacent layer ($L^p_{kl} = l \pm 1$) for forward/backward projections. The feature index $k'$ of the feature cell accessed by a weight $q$ of projection $p$ is determined by the index $K^{pq}_{kl} \in \{0, \ldots, K_{l'} - 1\}$, where $K_{l'}$ is the number of feature arrays in the source layer $l'$. Hence, access to any feature is allowed.
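To make these indices concrete, the following sketch shows one possible way to hold them in a data structure. The names (`Projection`, `source_layer`, `source_features`) are illustrative assumptions, not the book's implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Projection:
    """Illustrative description of projection p of feature array (k, l)."""
    source_layer: int              # L^p_kl: l-1, l, or l+1
    source_features: List[int]     # K^pq_kl for each weight q, each in {0, ..., K_l' - 1}
    weights: List[float]           # one weight value per q, shared by all cells of (k, l)
```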
The position of the accessed feature cell depends on the position $(i, j)$ of the computed cell. A function $\Upsilon_{ll'}(x)$ maps positions from layer $l$ to layer $l'$. If the resolution of the two layers differs by a factor of two, it is computed as follows:

\[
\Upsilon_{ll'}(x) =
\begin{cases}
2x & : \; l' = l - 1 \quad \text{[forward]} \\
x & : \; l' = l \quad \text{[lateral]} \\
\lfloor x/2 \rfloor & : \; l' = l + 1 \quad \text{[backward]}
\end{cases}
\tag{4.8}
\]
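A minimal sketch of the mapping (4.8), assuming the resolution changes by a factor of two between adjacent layers as stated above:

```python
def upsilon(x: int, l: int, l_src: int) -> int:
    """Map position x in layer l to the corresponding position in the source
    layer l_src (Eq. 4.8), assuming a resolution factor of two between layers."""
    if l_src == l - 1:        # forward: source layer has twice the resolution
        return 2 * x
    if l_src == l:            # lateral: same resolution
        return x
    if l_src == l + 1:        # backward: source layer has half the resolution
        return x // 2         # integer division realizes the floor in Eq. 4.8
    raise ValueError("source layer must equal l or be adjacent to it")
```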
If the source layer $l'$ consists of only a single hypercolumn at position $(0, 0)$, all positions in $l$ are mapped to this hypercolumn: $\Upsilon_{ll'}(x) = 0$. The hypercolumn of the accessed feature cell also depends on the weight $q$. An offset $(I^{pq}_{kl}, J^{pq}_{kl})$ is added to the corresponding position $(\Upsilon_{ll'}(i), \Upsilon_{ll'}(j))$. These offsets are usually small and access only an $M \times N$ hyper-neighborhood of $(\Upsilon_{ll'}(i), \Upsilon_{ll'}(j))$. For forward projections, the offsets are usually chosen such that $M$ and $N$ are even since
the offsets (0,0), (0,1), (1,0), and (1,1) describe the source-hypercolumns that cor-
respond to a higher-level hypercolumn when the resolution is changed by a fac-
tor of two between the layers. Common are 4 × 4 forward projections that overlap
with eight neighboring projections. For lateral projections, odd dimensions of the
neighborhood are used to produce a symmetric connection structure. A 3 × 3 lateral
neighborhood is common.
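Putting these pieces together, the following hedged sketch (reusing the `upsilon` function from the previous example) shows how one weight q of a projection could address its source cell. The function names and the exact placement of the even-sized forward neighborhood are assumptions consistent with the description above, not the book's implementation.

```python
from itertools import product

def neighborhood_offsets(m: int, n: int) -> list:
    """Enumerate an M x N hyper-neighborhood around the mapped position.
    Odd sizes (e.g. 3 x 3 lateral) come out centered; even sizes (e.g. 4 x 4
    forward) are placed so that (0,0), (0,1), (1,0), (1,1) lie in the middle."""
    rows = range(-((m - 1) // 2), m - (m - 1) // 2)
    cols = range(-((n - 1) // 2), n - (n - 1) // 2)
    return list(product(rows, cols))

def source_address(i, j, l, source_layer, feature_q, offset_q):
    """Layer, feature, and hypercolumn accessed by one weight q (illustrative)."""
    i_src = upsilon(i, l, source_layer) + offset_q[0]   # add offset I^pq_kl
    j_src = upsilon(j, l, source_layer) + offset_q[1]   # add offset J^pq_kl
    return source_layer, feature_q, i_src, j_src
```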
All feature cells of a feature array $(k, l)$ share the same forward and lateral projec-
tions. This weight sharing is motivated by the success of convolutional neural net-
works (see Section 3.1.2). While it is not biologically plausible that distant weights
are forced to have the same value, it is likely that similar stimuli that occur at differ-
ent positions of a feature map lead to the development of similar feature detectors.
This translational invariance of feature detection, which must be learned by cortical
feature maps, is prewired in the Neural Abstraction Pyramid architecture. It is ap-
propriate for the processing of images since low-level stimuli, like edges and lines,
are likely to occur at several positions in the image.
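As an illustration of what weight sharing means computationally, the sketch below evaluates a single shared 3 × 3 lateral template at every position of a feature array. This is a simplified reading of the text (one source array, zero padding, no bias term), not the book's algorithm.

```python
import numpy as np

def apply_shared_lateral_template(source: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Evaluate one shared 3 x 3 lateral template at every cell of a feature
    array: the same nine weights are reused at all positions (i, j), so the
    parameter count is independent of the array size."""
    h, w = source.shape
    padded = np.pad(source, 1)                 # zero padding at the array border
    out = np.zeros_like(source, dtype=float)
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 3, j:j + 3]  # 3 x 3 neighborhood around (i, j)
            out[i, j] = float(np.sum(window * weights))
    return out
```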
Weight sharing leads to descriptions of the processing elements with templates
that have few parameters. It also allows for a sharing of examples since a single im-
age contains multiple small windows with different instances of low-level features.
Both few parameters and example sharing facilitate generalization. The degree of
weight sharing is high in the lower layers of the network and decreases towards the
top of the pyramid. Low-complexity features are described by few weights, while
many parameters are needed to describe the extraction of complex features. Hence,
the network is able to learn low-level representations from relatively few examples,