[Figure 4.12. Diagram labels: Input, Output, Edges, Lines; Layer 0 (240 x 96 x 2), Layer 1 (120 x 48 x 4), Layer 2 (60 x 24 x 8); legend: excitatory, inhibitory.]
Fig. 4.12. ZIP code binarization - network architecture. The Neural Abstraction Pyramid
consists of three layers. The bottom layer represents the image in terms of
foreground/background features. The middle layer contains detectors for horizontal and vertical
step edges. In the top layer, the lines are represented by the activities of eight
orientation-selective line features.
This behavior is not ideal for recognition. Broken lines alter the structure of digits
considerably, and additional foreground pixels may also mislead recognition,
especially if they are close to the lines.
The reason for these binarization problems is the limited use of context information
in the thresholding method. Only global context via the intensity histogram
is used to determine the binarization threshold, but the local context of a pixel is
not considered for the binarization decision. In the following, a Neural Abstraction
Pyramid is described that makes this decision based on the local context. The idea
motivating the network's construction is to detect the lines and use them to bias
binarization. A pixel belonging to a line should be assigned to the foreground class,
even if it is not much darker than its neighborhood. On the other hand, dark pixels
should be assigned to the background if they are not supported by a line.
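The contrast between a purely global threshold and a line-biased decision can be illustrated with a minimal sketch. This is not the network described in the text, only a hand-written rule that mimics the stated idea. The function names, the `line_support` map (values in [0, 1]), and the `bias` parameter are hypothetical, and dark pixels (low intensity) are assumed to be foreground.

```python
def global_threshold(image, threshold):
    """Binarize with one threshold for the whole image (dark = foreground)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def line_biased_binarize(image, line_support, threshold, bias):
    """Shift the threshold per pixel using a hypothetical line-support map.

    line_support holds values in [0, 1]: 1 means a detected line strongly
    supports the pixel, 0 means no line support at all (assumption).
    """
    result = []
    for img_row, sup_row in zip(image, line_support):
        out_row = []
        for px, support in zip(img_row, sup_row):
            # Full support raises the threshold by `bias` (more lenient),
            # no support lowers it by `bias` (stricter).
            local_threshold = threshold + bias * (2.0 * support - 1.0)
            out_row.append(1 if px < local_threshold else 0)
        result.append(out_row)
    return result


# A faint pixel on a line (120) and a dark pixel without line support (80).
image = [[120, 80]]
line_support = [[1.0, 0.0]]
print(global_threshold(image, 100))
print(line_biased_binarize(image, line_support, 100, 40))
```

With these illustrative values, the global threshold drops the faint line pixel and keeps the unsupported dark pixel, while the biased rule does the opposite, which is the behavior the text asks for.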
The network's architecture is sketched in Figure 4.12. It consists of three layers
that represent the image at three levels of abstraction:
Layer 0 contains the input image, two excitatory feature arrays that represent the
foreground/background assignment, and one inhibitory feature array that contains
the sums of the foreground and the background features.
Layer 1 contains four feature arrays that represent horizontal and vertical step
edges. One inhibitory feature array contains the sum of the edges.
Layer 2 contains eight excitatory feature arrays that represent lines in different
orientations. Two inhibitory feature arrays compute the sums of the more horizontal
and the more vertical lines, respectively. One inhibitory feature array sums
lines of all orientations.
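The layer layout listed above can be written down as a small data structure. This is a sketch under assumptions: the `LayerSpec` type and its field names are hypothetical, the figure's sizes are read as width x height x number of excitatory feature arrays, and the four edge features of Layer 1 are taken to be excitatory. The input image in Layer 0 is not counted as a feature array here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSpec:
    name: str
    width: int       # assumed: first number in the figure label
    height: int      # assumed: second number in the figure label
    excitatory: int  # number of excitatory feature arrays
    inhibitory: int  # number of inhibitory feature arrays


PYRAMID = (
    # Foreground and background features, plus one inhibitory sum.
    LayerSpec("Layer 0", 240, 96, excitatory=2, inhibitory=1),
    # Horizontal and vertical step edges, plus one inhibitory sum.
    LayerSpec("Layer 1", 120, 48, excitatory=4, inhibitory=1),
    # Eight line orientations, plus three inhibitory sums
    # (more horizontal, more vertical, all orientations).
    LayerSpec("Layer 2", 60, 24, excitatory=8, inhibitory=3),
)


def halves_and_doubles(lower, upper):
    """Check that resolution halves and excitatory features double."""
    return (
        upper.width * 2 == lower.width
        and upper.height * 2 == lower.height
        and upper.excitatory == 2 * lower.excitatory
    )


for lower, upper in zip(PYRAMID, PYRAMID[1:]):
    print(lower.name, "->", upper.name, halves_and_doubles(lower, upper))
```

Under this reading, each step up the pyramid halves the spatial resolution in both dimensions and doubles the number of excitatory feature arrays.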