forward, lateral, and backward projections with linear transfer functions. Forward
projections come from 4 × 4 windows of all feature arrays in the layer below. Lateral
projections originate from the 5 × 5 hyper-neighborhood in the same layer and back-
ward projections access a single cell of all feature arrays in the layer above. The
weights can have positive or negative values and are allowed to change their sign
during training. The network has a total of 11,910 different weights. Most of them
are located in the top layer, since the relatively few weights in the lower layers are
shared far more often than those in the higher layers.
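To make the projection structure concrete, the following Python fragment sketches how the input of a single cell could be assembled from the three projection types. It is only an illustrative sketch: the names (update_cell, w_fwd, w_lat, w_bwd), the exact window offsets, and the omission of border handling are assumptions, not the actual implementation.

```python
import numpy as np

def update_cell(x, y, below, same, above, w_fwd, w_lat, w_bwd, bias):
    """Assemble the input of one cell at position (x, y) from the three
    projection types; the projections use linear transfer functions, so
    the contributions are simply weighted sums."""
    # Forward projection: a 4x4 window of every feature array in the layer
    # below (which has twice the resolution of the current layer).
    fwd_patch = below[:, 2 * y:2 * y + 4, 2 * x:2 * x + 4]
    fwd = np.sum(w_fwd * fwd_patch)

    # Lateral projection: the 5x5 hyper-neighborhood in the same layer
    # (assumes (x, y) lies away from the border; border handling omitted).
    lat_patch = same[:, y - 2:y + 3, x - 2:x + 3]
    lat = np.sum(w_lat * lat_patch)

    # Backward projection: a single cell of all feature arrays in the
    # layer above (which has half the resolution).
    bwd = np.sum(w_bwd * above[:, y // 2, x // 2])

    return fwd + lat + bwd + bias
```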
The version of the Neural Abstraction Pyramid network that is used for binariza-
tion has relatively few feature arrays. The reason for this restriction was the need to
limit the computational effort of simulating the pyramid on a PC. Due to the rel-
atively high resolution of the input images, the iterative binarization of one Data
Matrix code required about two seconds on a Pentium 4 1.7 GHz PC.
The undegraded Data Matrix images as well as their degraded versions are pre-
sented to the network without any preprocessing. One of the feature arrays in the
bottom layer is used as network output.
The target values that are used as the desired output for the supervised training
are computed using the adaptive thresholding method for the undegraded images.
The network is trained to iteratively produce these targets not only for the original
images, but also for their degraded versions. This approach has the advantage that
no separate effort is needed to create desired outputs for the low-quality images.
Producing such outputs for the degraded images without relying on the original
versions would require time-consuming manual labeling, which applying adaptive
thresholding to the undegraded originals avoids.
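As an illustration, the target generation could look roughly like the sketch below, which uses a simple mean-filter threshold; the precise adaptive thresholding variant and its parameters (window, offset) are assumptions made here for clarity.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_threshold_target(undegraded, window=15, offset=0.02):
    """Binarize an undegraded gray-scale image (values scaled to [0, 1]).

    Each pixel is compared against the mean of its local window; the
    resulting binary image is used as the training target for both the
    original image and its degraded version."""
    local_mean = uniform_filter(undegraded.astype(np.float64), size=window)
    return (undegraded > local_mean - offset).astype(np.float64)
```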
The 515 high-contrast images were partitioned randomly into 334 training im-
ages (TRN) and 181 test examples (TST). For each example, one degraded version is
added to the sets. The network is trained for ten iterations with a linearly increasing
error-weight using backpropagation through time (BPTT) and RPROP, as described
in Section 6.
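As a rough illustration of the linearly increasing error-weight, the training loss over the unfolded iterations might be written as in the sketch below. The specific weighting (weight proportional to the iteration index) and the name weighted_sequence_loss are assumptions; gradients of such a loss would be propagated through the unrolled iterations (BPTT) and the weight updates computed with RPROP.

```python
import numpy as np

def weighted_sequence_loss(outputs, target, n_iterations=10):
    """Squared error summed over the unfolded iterations, with an error
    weight that grows linearly with the iteration index.

    outputs : list of output feature arrays, one per network iteration
    target  : desired binarized image produced by adaptive thresholding"""
    loss = 0.0
    for t, out in enumerate(outputs[:n_iterations], start=1):
        weight = t / n_iterations   # 0.1, 0.2, ..., 1.0 for ten iterations
        loss += weight * np.sum((out - target) ** 2)
    return loss
```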
8.6 Experimental Results
After training, the network is able to iteratively solve the binarization task. Fig-
ure 8.11 displays how the activities of all features evolve over time for one of the
degraded test examples. It can be seen that the lower layers represent the cell struc-
ture of the code, while the higher layers are dominated by representations of the
background level and the local black-and-white ratio. One can furthermore observe
that the network performs an iterative refinement of an initial solution with most
changes occurring in the first few iterations and fewer changes towards the end of
the computation. In fact, the activities of iterations 7 and 11 are hardly distinguishable.
In Figure 8.12, the activities of the two Layer 0 feature arrays are displayed
in more detail. The upper row shows the development of the output. In the first