4.3.2 Binarization of Handwriting
In the previous example, we have seen that local interaction in a hierarchy can
implement globally interesting computations. However, the representational power of
the network used for contrast normalization was limited, since the constant number
of features per layer did not counteract the decrease of resolution towards the top of
the pyramid.
In the second example network, the number of features increases by a factor of
two when going up one layer, as described in Section 4.1.1. The increasing number
of features is used to build a hierarchical model of handwriting for the task of
binarization. Figure 4.11 shows some examples from the dataset used for the
experiments. The original images were extracted by Siemens AG from large-sized letters,
called flats, for the purpose of automated mail sorting. They contain handwritten
German ZIP codes on a relatively dark background.
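The effect of doubling the number of features per layer can be illustrated with a little arithmetic. The sketch below is not taken from the book: the function name `pyramid_shapes`, the 64x64 input size, the four base features, and the assumption that the resolution halves in each dimension per layer are all hypothetical choices made for illustration.

```python
def pyramid_shapes(width, height, base_features, layers,
                   feature_growth=2, downsample=2):
    """List (layer, features, width, height, total cells) for each layer.

    Assumption: resolution shrinks by `downsample` in each dimension per
    layer, while the feature count grows by `feature_growth` per layer.
    """
    shapes = []
    for layer in range(layers):
        w = width // downsample ** layer
        h = height // downsample ** layer
        f = base_features * feature_growth ** layer
        shapes.append((layer, f, w, h, f * w * h))
    return shapes


if __name__ == "__main__":
    # Hypothetical sizes: a 64x64 input with 4 features in the bottom layer.
    for growth, label in ((1, "constant features"), (2, "doubling features")):
        print(label)
        for layer, f, w, h, total in pyramid_shapes(64, 64, 4, 4, growth):
            print(f"  layer {layer}: {f:3d} features at {w}x{h} = {total} cells")
```

Under these assumptions, a constant feature count shrinks the representation by a factor of four per layer, while doubling the features slows the shrinkage to a factor of two.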
Binarization of these images is one step towards the recognition of the ZIP
codes. It assigns each pixel to one of two classes: the foreground or the background.
The goal is to assign the pixels belonging to the strokes of the digits to the
foreground class, and all other pixels to the background. Binarization discards variance
that is not relevant for recognition, such as the brightness of the lighting and the
structure of the paper, and keeps recognition-relevant aspects, such as the shape of the
lines. This task is non-trivial due to different sources of noise and variance. For
instance, the line thickness varies considerably because different pens were used
to write the digits. In addition, the image contrast is sometimes low because of the
darkness of the paper and the weakness of the writing device. Furthermore, the structure
of the paper and background clutter are sources of noise. Finally, due to the height
of the letters, some images were captured outside of the camera's focal plane,
which leads to blurred line borders.
Histogram-based thresholding techniques are among the most popular binarization
methods described in the literature [122]. They assign pixels to the two classes
based on the intensity alone. Pixels that are darker than a threshold are assigned
to the foreground and all other pixels to the background class. If the intensity
histogram of the image is bimodal, the two peaks correspond to the foreground and the
background pixels. One can search for a local minimum in the smoothed histogram
between the two peaks to determine a binarization threshold. Figure 4.11(b) shows
thresholded versions of the original images. It can be observed that thresholding
breaks weak lines into pieces and also assigns small dark clutter to the foreground.
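The valley search described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the procedure used to produce Figure 4.11(b): the function names, the moving-average smoothing with `radius=2`, and the distance-weighted heuristic for locating the second peak are all choices made here, and the histogram is assumed to be bimodal.

```python
def histogram(pixels, levels=256):
    """Count the pixels at each intensity level 0..levels-1."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts


def smooth(counts, radius=2):
    """Moving average over 2*radius+1 bins; the window is clipped at both ends."""
    n = len(counts)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        smoothed.append(sum(counts[lo:hi]) / (hi - lo))
    return smoothed


def valley_threshold(pixels, levels=256, radius=2):
    """Return the lowest bin of the smoothed histogram between its two peaks."""
    h = smooth(histogram(pixels, levels), radius)
    # Highest peak of the smoothed histogram.
    first = max(range(levels), key=lambda k: h[k])
    # Assumed heuristic for the second peak: favor bins that are both
    # well populated and far away from the first peak.
    second = max(range(levels), key=lambda k: (k - first) ** 2 * h[k])
    lo, hi = sorted((first, second))
    # Lowest bin between the peaks; ties resolve to the darkest such bin.
    return min(range(lo, hi + 1), key=lambda k: h[k])


def binarize(pixels, threshold):
    """Label pixels darker than the threshold as foreground (1), others as 0."""
    return [1 if p < threshold else 0 for p in pixels]


if __name__ == "__main__":
    # Synthetic bimodal data: 100 dark stroke pixels, 300 bright paper pixels.
    pixels = [38, 39, 40, 41, 42] * 20 + [198, 199, 200, 201, 202] * 60
    t = valley_threshold(pixels)
    print("threshold:", t, "foreground pixels:", sum(binarize(pixels, t)))
```

Because the decision depends on intensity alone, a faint stroke pixel brighter than the threshold is lost and a dark speck below it is kept, which matches the failure modes noted in the text.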
Fig. 4.11. ZIP code binarization dataset: (a) original grayscale images; (b) binarized using
thresholding.