Fig. 3.2. Image compression using pruned pyramids: (a) original image of a letter; (b) resolution
used after pruning (darker shading corresponds to higher resolution; the compression
ratio is 150:1); (c) reconstructed address region; (d) difference between the reconstruction
and the original (amplified for visibility).
only at the corresponding positions to verify and refine the hypotheses. This saves
computational cost compared to a high-resolution search.
Burt and Adelson [38] proposed the use of differences $L_0, L_1, \ldots, L_{k-1}$ between
the levels of a Gaussian pyramid as a low-entropy representation for image
compression. The set of $L_i$ is called a Laplacian pyramid. The $L_i$ are computed as
pixel-wise differences between $G_i$ and its estimate $\tilde{G}_i = \mathrm{expand}(G_{i+1})$, obtained
by supersampling $G_{i+1}$ to the higher resolution and interpolating the missing values.
Fig. 3.1(b) shows the Laplacian pyramid for the example. It decomposes the image
into a sequence of spatial frequency bands. Given $G_k$ and $L_0, L_1, \ldots, L_{k-1}$,
$G_0$ can be reconstructed perfectly by using the recursion $G_i = \tilde{G}_i + L_i$.
Since for natural images the values of $L_i$ are mostly close to zero, they can be
compressed using quantization. The reconstruction proceeds in a top-down fashion.
Thus, progressive transmission of images is possible with this scheme.
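The construction and the top-down reconstruction described above can be sketched in a few lines of NumPy. This is a minimal illustration under simplifying assumptions, not the scheme of [38] itself: 2x2 block averaging stands in for the Gaussian smoothing, pixel replication stands in for the interpolation inside expand, the image sides are assumed to be divisible by 2^k, and the function names (reduce_level, build_laplacian, reconstruct, quantize) are hypothetical.

```python
import numpy as np

def reduce_level(g):
    """Halve the resolution by averaging 2x2 blocks (assumed stand-in for Gaussian smoothing)."""
    h, w = g.shape
    return g.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def expand(g):
    """Double the resolution by pixel replication (assumed stand-in for interpolation)."""
    return np.repeat(np.repeat(g, 2, axis=0), 2, axis=1)

def build_laplacian(g0, k):
    """Return (G_k, [L_0, ..., L_{k-1}]) with L_i = G_i - expand(G_{i+1})."""
    g = np.asarray(g0, dtype=np.float64)
    laplacians = []
    for _ in range(k):
        g_next = reduce_level(g)
        laplacians.append(g - expand(g_next))
        g = g_next
    return g, laplacians

def reconstruct(gk, laplacians):
    """Top-down recursion G_i = expand(G_{i+1}) + L_i, coarsest level first."""
    g = gk
    for lap in reversed(laplacians):
        g = expand(g) + lap
    return g

def quantize(x, step):
    """Uniform quantization to multiples of `step`; the error is at most step / 2 per value."""
    return step * np.round(x / step)
```

With unquantized levels, reconstruct returns the original image up to floating-point rounding. When each level is quantized with the same step and G_k is kept exact, the per-level errors add along the recursion, so with this particular expand the pixel error stays within k * step / 2. Because reconstruct works from the coarsest level down, every intermediate g is a usable lower-resolution preview, which is what makes progressive transmission possible.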
Since the pyramid has a tree structure, it can be pruned to reduce its size. This
method works well if the significant image details are confined within small regions.
Figure 3.2 shows an image of a letter with size 2,048 × 1,412. Most of the area can
be represented safely by using only the lower resolution levels, while the higher
resolutions concentrate at the edges of the print. Although pruning compresses the
image by a ratio of 150:1, the address is still clearly readable.
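The text does not state which criterion was used to prune the pyramid in Fig. 3.2, so the following is only a sketch of the general idea under an assumed rule: a block is kept at its coarse resolution when its gray values span at most a tolerance tol, and is split into four children otherwise. The image is assumed to be square with a power-of-two side, and the names prune and reconstruct_pruned are hypothetical.

```python
import numpy as np

def prune(img, y, x, size, tol, leaves):
    """Collect leaves (y, x, size, mean) of a pruned mean pyramid, top-down.

    Assumed pruning rule: stop refining a block once max - min <= tol.
    """
    block = img[y:y + size, x:x + size]
    if size == 1 or block.max() - block.min() <= tol:
        leaves.append((y, x, size, float(block.mean())))
        return
    half = size // 2
    for dy in (0, half):
        for dx in (0, half):
            prune(img, y + dy, x + dx, half, tol, leaves)

def reconstruct_pruned(leaves, size):
    """Paint every leaf block with its stored mean value."""
    out = np.zeros((size, size))
    for y, x, s, value in leaves:
        out[y:y + s, x:x + s] = value
    return out
```

Calling prune(img, 0, 0, img.shape[0], tol, leaves) with an empty list fills leaves in place. The number of leaves relative to the number of pixels gives a rough measure of the compression achieved, and the leaf sizes form a resolution map comparable in spirit to Fig. 3.2(b): small leaves cluster where the image has fine detail. By construction, every reconstructed pixel differs from the original by at most tol.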
Another application of image pyramids is hierarchical block matching, proposed
by Bierling [31] for motion estimation in video sequences. Since the higher levels of
the pyramid are increasingly invariant to translations, image motion is estimated in
the coarsest resolution first. The estimated displacement vectors are used as starting