Fig. 9.2. Some digits from the NIST dataset. Shown are: (a) centered images in the original resolution (64 × 64); (b) subsampled to 16 × 16 pixels (pixelized); (c) bicubic interpolation to original resolution (blurred).
9.2.1 NIST Digits Dataset
The first reconstruction experiment uses the original NIST images of segmented, binarized handwritten digits [80]. They were extracted by NIST from hand-printed sample forms. The digits are contained in a 128 × 128 window, but their bounding box is typically much smaller. For this reason, the bounding box was centered in a 64 × 64 window to produce the desired output Y. Figure 9.2(a) shows some centered sample images from the NIST dataset. The input X to the network consists of subsampled versions of the digits with resolution 16 × 16, shown for the same examples in Fig. 9.2(b); these were produced by averaging non-overlapping blocks of 4 × 4 pixels. Part (c) of the figure demonstrates that bicubic interpolation is not an adequate method for increasing the resolution of the NIST digits, since it produces blurred images.
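As an illustration, the following is a minimal sketch of how such an input–output pair could be prepared. The helper name and the NumPy-based implementation are assumptions for illustration; the text only specifies the centering of the bounding box in a 64 × 64 window and the averaging of 4 × 4 pixel blocks.

import numpy as np

def make_training_pair(digit_128):
    """Build a (low-res input X, high-res target Y) pair from a 128x128 binary digit.

    Hypothetical helper, not from the original text: it centers the digit's
    bounding box in a 64x64 window (the target Y) and averages non-overlapping
    4x4 blocks to obtain the 16x16 input X.
    """
    # Find the bounding box of the foreground pixels.
    rows = np.any(digit_128 > 0, axis=1)
    cols = np.any(digit_128 > 0, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    box = digit_128[r0:r1 + 1, c0:c1 + 1]

    # Center the bounding box in a 64x64 window -> desired output Y
    # (assumed to fit, since the bounding box is typically much smaller).
    Y = np.zeros((64, 64), dtype=np.float32)
    h, w = box.shape
    top, left = (64 - h) // 2, (64 - w) // 2
    Y[top:top + h, left:left + w] = box

    # Subsample to 16x16 by averaging non-overlapping 4x4 blocks -> input X.
    X = Y.reshape(16, 4, 16, 4).mean(axis=(1, 3))
    return X, Y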
9.2.2 Architecture for Super-Resolution
The network used for the super-resolution task is a very small instance of the Neural
Abstraction Pyramid architecture. Besides the input and the output feature arrays,
determined by the task, it has additional features only in the hidden layer. Such a
small network was chosen because it proved to be sufficient for the task.
The architecture of the network is illustrated in Figure 9.3. It consists of three layers. The rightmost Layer 2 contains only a single feature array of resolution 16 × 16. The activities of its cells are set to the low-resolution input image. Layer 1 has resolution 32 × 32. It contains four feature arrays that produce a hidden representation of the digit. The leftmost Layer 0 contains only a single feature array that is used as the network output. It has a resolution of 64 × 64.
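Read as array shapes, this instance of the pyramid amounts to the following; the resolutions and feature counts are as stated above, while the variable names are illustrative only:

# Shapes given as (feature arrays, height, width); names are illustrative.
layer2_input  = (1, 16, 16)   # Layer 2: one feature array, set to the low-resolution input
layer1_hidden = (4, 32, 32)   # Layer 1: four feature arrays, hidden representation
layer0_output = (1, 64, 64)   # Layer 0: one feature array, used as the network output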
The feature cells of the output feature have lateral and backward projections. The weight matrix of the lateral projections has a size of 3 × 3. The 2 × 2 different backward projections each access a single feature cell of each feature array in Layer 1. This corresponds to the inverse of non-overlapping 2 × 2 forward projections for the four Layer 1 features.
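To make this connectivity concrete, here is a minimal NumPy sketch of how the activity of the output feature array could be updated from these two projections. The weight names, the additive combination of projections, and the sigmoidal transfer function are assumptions for illustration; only the projection sizes are taken from the text.

import numpy as np
from scipy.ndimage import convolve

def update_output(output_64, hidden_32, w_lateral, w_backward, bias=0.0):
    """One update of the 64x64 output feature array (illustrative sketch).

    output_64:  (64, 64) current output activities
    hidden_32:  (4, 32, 32) activities of the four Layer 1 feature arrays
    w_lateral:  (3, 3) lateral weight matrix on the output feature
    w_backward: (2, 2, 4) backward weights; position (i, j) within each 2x2
                output block has its own weight per Layer 1 feature
    """
    # Lateral projection: 3x3 neighborhood of the output feature itself.
    net = convolve(output_64, w_lateral, mode="constant")

    # Backward projection: each output cell reads the single Layer 1 cell it
    # corresponds to; the four cells of one 2x2 output block share that cell
    # but use different weights (the inverse of non-overlapping 2x2 forward
    # projections).
    for i in range(2):
        for j in range(2):
            net[i::2, j::2] += np.tensordot(w_backward[i, j], hidden_32, axes=(0, 0))

    # Assumed sigmoidal transfer function.
    return 1.0 / (1.0 + np.exp(-(net + bias)))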
Feature cells in Layer 1 have all three types of projections. Forward projections
access 2 × 2 windows of the output feature array in Layer 0. Lateral projections