Related Work - Hierarchical Neural Networks for Image Interpretation


Fig. 3.1. Image pyramids: (a) Gaussian pyramid with levels $G_0$ (320 × 240), $G_1$ (160 × 120), $G_2$ (80 × 60), $G_3$ (40 × 30), and $G_4$ (20 × 15), representing only coarse structures at the higher levels; (b) Laplacian pyramid with levels $L_0$ to $L_3$, containing the differences between Gaussian levels (amplified for better visibility).

a wide range of signals, including images, but offer only limited adaptability to a

specific dataset.

Image Pyramids. Multiresolution representations called image pyramids are a widely used tool in image processing and computer graphics. In an image pyramid, the image is represented not only at the given high resolution $G_0$, but through a sequence $G_0, G_1, \ldots, G_k$ of 2D pixel arrays with decreasing resolutions.

A reduce operation computes the next higher level $G_{i+1}$ from the level $G_i$ using only local operations. Most common is the dyadic Gaussian pyramid, where a pixel $G_{i+1}(i, j)$ is computed as the weighted average of the pixels around the corresponding position $G_i(2i + \frac{1}{2}, 2j + \frac{1}{2})$ in the lower level. Each step reduces the image resolution by a factor of two in both dimensions. Figure 3.1(a) shows such a Gaussian pyramid for an example image. Its total size is slightly less than $\frac{4}{3}$ of the size of $G_0$, since each level holds a quarter of the pixels of the level below it.
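The reduce operation and the resulting pyramid size can be illustrated with a minimal sketch. It assumes the simplest possible weighting: each coarse pixel is the equal-weight average of the 2 × 2 block of finer pixels it covers, whereas a true Gaussian pyramid would use a larger, Gaussian-shaped kernel. The function names (reduce_level, build_pyramid, pixel_count) and the synthetic test image are hypothetical and chosen only for illustration.

```python
def reduce_level(img):
    """Halve the resolution in both dimensions.

    Each output pixel is the equal-weight average of a 2x2 block of the
    input (an assumed stand-in for the Gaussian-weighted average).
    """
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_pyramid(img, levels):
    """Return [G_0, G_1, ..., G_levels] by applying reduce_level repeatedly."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


def pixel_count(pyramid):
    """Total number of pixels stored over all levels."""
    return sum(len(g) * len(g[0]) for g in pyramid)


# Synthetic 320 x 240 image (240 rows of 320 pixels); the values are made up.
g0 = [[float((r + c) % 256) for c in range(320)] for r in range(240)]
pyramid = build_pyramid(g0, 4)

# (width, height) of each level, from G_0 up to G_4
sizes = [(len(g[0]), len(g)) for g in pyramid]

# Total pyramid size relative to G_0; stays below 4/3
ratio = pixel_count(pyramid) / (320 * 240)

print(sizes)
print(ratio)
```

With four reduce steps the level sizes match those in Fig. 3.1(a), and the size ratio is the partial sum 1 + 1/4 + 1/16 + 1/64 + 1/256, which approaches but never reaches 4/3.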

While image details are visible only in the lower levels of the Gaussian pyramid,

the higher levels make larger objects accessible in small windows. This allows one

to design coarse-to-fine algorithms [196] for image analysis. Such algorithms start by analyzing the image at the coarsest resolution, which can be processed quickly. As they proceed to the finer levels, they use the coarse results to bias the finer analysis.

For instance, when searching for an object, a small number of hypotheses can be

established by inspecting the coarse resolution. The finer resolutions are analyzed
