Neural Abstraction Pyramid Architecture - Hierarchical Neural Networks for Image Interpretation

Layer 0: Foreground/Background. Figure 4.13 summarizes the templates used for the processing elements of the pyramid's bottom-layer features and shows the stable response of the network to a test pattern consisting of three circles.

The input array I is set to a version of the input image whose intensity has been shifted so that the mean value equals 0.5. Furthermore, the image is reduced in size by a factor of two if it does not fit into a 232 × 88 window, and is then smoothly framed to match the array size of 240 × 96.
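The preparation of the input array can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, the size reduction is done by 2 × 2 block averaging, and a constant border of value 0.5 stands in for the smooth framing, whose exact form is not given here.

```python
def shift_mean(image, target=0.5):
    """Shift all intensities by one constant so that the mean equals `target`."""
    n = sum(len(row) for row in image)
    mean = sum(sum(row) for row in image) / n
    return [[v - mean + target for v in row] for row in image]


def halve(image):
    """Reduce the size by a factor of two with 2x2 block averaging (assumed method)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1]
              + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def frame(image, width=240, height=96, fill=0.5):
    """Center the image in a width x height array filled with `fill`.

    Assumption: a constant border replaces the smooth framing of the text.
    """
    h, w = len(image), len(image[0])
    top, left = (height - h) // 2, (width - w) // 2
    out = [[fill] * width for _ in range(height)]
    for y in range(h):
        for x in range(w):
            out[top + y][left + x] = image[y][x]
    return out


def prepare_input(image, window=(232, 88), array=(240, 96)):
    """Build the input array from a grayscale image given as a list of rows."""
    if len(image[0]) > window[0] or len(image) > window[1]:
        image = halve(image)  # the text mentions a single reduction only
    if len(image[0]) > window[0] or len(image) > window[1]:
        raise ValueError("image does not fit the window after one reduction")
    image = shift_mean(image, 0.5)
    return frame(image, array[0], array[1], fill=0.5)
```

Because the border has the same value as the shifted image mean, the framed array keeps a mean of 0.5 in this sketch.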

The input projections to the foreground feature F and the background feature B have a center-surround structure. They are set to differences between 3 × 3 and 7 × 7 binomial kernels. The central weight has an amplitude of 4.031, and the projections have a DC part of ±1.5. This is offset by a bias of 0.75 to the background input projection. The foreground bias is set to −0.8, suppressing responses to intensities that are slightly larger than average. Hence, the forward potentials of the foreground react best to a center that is darker than its neighborhood (a line), and the forward potentials of the background react best to a bright center that is surrounded by dark lines (a loop center).
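A center-surround kernel of this kind can be built as follows. The sketch assumes normalized binomial kernels and rescales their difference so that the central weight equals 4.031; this rescaling rule is an assumption, since only the amplitude is stated. The DC part of ±1.5, the biases, and the sign that distinguishes the two features are not modeled.

```python
def binomial_1d(size):
    """Row `size - 1` of Pascal's triangle, normalized to sum to 1."""
    row = [1]
    for _ in range(size - 1):
        row = [a + b for a, b in zip([0] + row, row + [0])]
    total = float(sum(row))
    return [v / total for v in row]


def binomial_2d(size):
    """Separable 2D binomial kernel, formed as an outer product."""
    k = binomial_1d(size)
    return [[a * b for b in k] for a in k]


def center_surround(small=3, large=7, center_weight=4.031):
    """Small binomial kernel minus large binomial kernel on the large grid.

    Assumption: the difference is rescaled so that its central weight
    equals `center_weight`.
    """
    inner, outer = binomial_2d(small), binomial_2d(large)
    pad = (large - small) // 2
    diff = [[-v for v in row] for row in outer]
    for y in range(small):
        for x in range(small):
            diff[pad + y][pad + x] += inner[y][x]
    scale = center_weight / diff[large // 2][large // 2]
    return [[v * scale for v in row] for row in diff]
```

Since both binomial kernels sum to one, the weights of their difference sum to zero, so a constant image produces no response before the DC part and the bias are added.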

Lateral projections to the two excitatory features have a specific excitatory and an unspecific inhibitory part. Excitation comes from the 3 × 3 neighborhood of the same feature and inhibition from a 5 × 5 window of the sum S_FB of the two features. The feature cells do not excite themselves but inhibit themselves via S_FB. Hence, the lateral connectivity favors blob-like activities that extend over multiple neighboring pixels and suppresses isolated active cells. The lateral excitation for the background is stronger than the one for the foreground. The opposite applies to the inhibition. Thus, the lateral competition between the two features favors the background. Initial foreground responses are removed if they are not supported by neighboring foreground pixels or by edges detected from Layer 1.
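The effect of this lateral connectivity can be illustrated with a small sketch. The uniform weights `w_exc` and `w_inh` are hypothetical placeholders, not the templates of Figure 4.13, and cells outside the array are assumed to contribute nothing.

```python
def lateral_potential(feature, total, y, x, w_exc=0.2, w_inh=0.05):
    """Lateral input to the cell at (y, x) of one feature map.

    feature: 2D list with the activities of the feature itself (F or B)
    total:   2D list with the sum S_FB of both features
    w_exc, w_inh: hypothetical uniform weights chosen for illustration
    """
    h, w = len(feature), len(feature[0])
    excitation = 0.0
    for dy in range(-1, 2):          # 3x3 neighborhood of the same feature
        for dx in range(-1, 2):
            if dy == 0 and dx == 0:
                continue             # cells do not excite themselves
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                excitation += feature[yy][xx]
    inhibition = 0.0
    for dy in range(-2, 3):          # 5x5 window of S_FB
        for dx in range(-2, 3):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                inhibition += total[yy][xx]  # includes the cell itself
    return w_exc * excitation - w_inh * inhibition


# An isolated active cell versus the center of a 3x3 blob.
isolated = [[0.0] * 7 for _ in range(7)]
isolated[3][3] = 1.0
blob = [[1.0 if 2 <= r <= 4 and 2 <= c <= 4 else 0.0 for c in range(7)]
        for r in range(7)]
p_isolated = lateral_potential(isolated, isolated, 3, 3)
p_blob = lateral_potential(blob, blob, 3, 3)
```

With these placeholder weights the isolated cell receives only its own inhibition and ends up with a negative lateral input, while the blob center is supported by its eight neighbors and receives a positive one.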

Top-down support comes from the backward projections, which are the inverse of the excitatory forward projections to the edge features. They expand the edge representation to the higher-resolution foreground/background representation. Unspecific backward inhibition comes from the sum of the edges S_E.

Layer 1: Edges. The middle layer of the binarization network is summarized in Figure 4.14. Four features detect step edges. E_T responds to the top edge of horizontal lines and E_B to their bottom edge. The left and right edges of vertical lines excite E_L and E_R.

The specific excitatory weights of the 6 × 6 forward projections resemble the oriented foreground/background double line that is characteristic of step edges in Layer 0. Unspecific forward inhibition comes from S_FB weighted with a 6 × 6 binomial kernel. The forward projections have a bias weight of −0.05 to prevent reaction to spurious edges. The sum of the edge features is computed by S_E.
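The forward potential of a single edge cell might be computed as sketched below. Several details are assumptions made for illustration: Layer 1 is taken to have half the resolution of Layer 0, so the 6 × 6 window of cell (i, j) starts at Layer-0 position (2i − 2, 2j − 2); the specific weight templates `w_fg` and `w_bg` are supplied by the caller; and the inhibition gain `w_inh` is a hypothetical parameter.

```python
def binomial_1d(size):
    """Row `size - 1` of Pascal's triangle, normalized to sum to 1."""
    row = [1]
    for _ in range(size - 1):
        row = [a + b for a, b in zip([0] + row, row + [0])]
    total = float(sum(row))
    return [v / total for v in row]


def edge_forward_potential(fg, bg, w_fg, w_bg, i, j, w_inh=1.0, bias=-0.05):
    """Forward potential of one edge cell at Layer-1 position (i, j).

    fg, bg:     Layer-0 activity maps of foreground and background
    w_fg, w_bg: 6x6 specific excitatory weights (caller-supplied templates)
    w_inh:      hypothetical gain of the unspecific inhibition
    """
    k = binomial_1d(6)
    h, w = len(fg), len(fg[0])
    potential = bias
    for dy in range(6):
        for dx in range(6):
            y, x = 2 * i - 2 + dy, 2 * j - 2 + dx  # assumed window offset
            if 0 <= y < h and 0 <= x < w:
                s_fb = fg[y][x] + bg[y][x]
                potential += w_fg[dy][dx] * fg[y][x] + w_bg[dy][dx] * bg[y][x]
                potential -= w_inh * k[dy] * k[dx] * s_fb  # binomial-weighted S_FB
    return potential


def edge_sum(e_t, e_b, e_l, e_r):
    """S_E: cell-wise sum of the four edge features."""
    return [[a + b + c + d for a, b, c, d in zip(*rows)]
            for rows in zip(e_t, e_b, e_l, e_r)]
```

A template for E_T would, for example, place positive background weights in the rows above positive foreground weights, mirroring the double line at the top edge of a horizontal stroke. Without any input the potential equals the bias of −0.05, so weak or spurious evidence does not activate the cell.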

Lateral projections mediate cooperation between aligned edges of the same or similar orientations by 3 × 3 excitatory kernels and unspecific competition via a 5 × 5 binomial kernel, convolved with S_E. Since edge cells do not excite themselves, they must be supported by other edges or line features to survive the competition.
