Databases Reference
In-Depth Information
(see Problem 1 at the end of this chapter). The error introduced in the
{ θ n }
sequence is the
sum of squares of the
θ n s that are set to zero. The magnitudes of these elements are quite
small and, therefore, the total error introduced into the reconstructed sequence is quite small
also.
We could reduce the number of samples we need to code because most of the information
contained in each pair of values is put into one element of each pair. As the other element
of the pair contains very little information, we could discard it without a significant effect on
the fidelity of the reconstructed sequence. The transform in this case acts on pairs of values;
therefore, the maximum reduction in the number of significant samples is a factor of two. We
can extend this idea to longer blocks of data. By compactingmost of the information in a source
output sequence into a few elements of the transformed sequence using a reversible transform,
and then discarding the elements of the sequence that do not contain much information, we
can get a large amount of compression. This is the basic idea behind transform coding.
In Example 13.2.1 we present a geometric view of the transform process. We can also
examine the transform process in terms of the changes in statistics between the original and
transformed sequences. It can be shown that we can get the maximum amount of compaction
if we use a transform that decorrelates the input sequence; that is, the sample-to-sample
correlation of the transformed sequence is zero. The first transform to provide decorrelation
for discrete data was presented by Hotelling [ 183 ]inthe Journal of Educational Psychology in
1933. He called his approach the method of principal components . The analogous transform
for continuous functions was obtained by Karhunen [ 184 ] and Loéve [ 185 ]. This decorrelation
approach was first utilized for compression, in what we now call transform coding, by Kramer
and Mathews [ 186 ], and Huang and Schultheiss [ 187 ].
Transform coding consists of three steps. First, the data sequence
{
x n }
is divided into
{ θ n }
blocks of size N . Each block is mapped into a transform sequence
using a reversible
mapping in a manner similar to that described in Example 13.2.1. As shown in the example,
different elements of each block of the transformed sequence generally have different statistical
properties. In Example 13.2.1, most of the energy of the block of two input values was
contained in the first element of the block of two transformed values, while very little of the
energy was contained in the second element. This meant that the second element of each
block of the transformed sequence would have a small magnitude, while the magnitude of the
first element could vary considerably depending on the magnitude of the elements in the input
block. The second step consists of quantizing the transformed sequence. The quantization
strategy used will depend on three main factors: the desired average bit rate, the statistics of the
various elements of the transformed sequence, and the effect of distortion in the transformed
coefficients on the reconstructed sequence. In Example 13.2.1, we could take all the bits
available to us and use them to quantize the first coefficient. In more complex situations,
the strategy used may be very different. In fact, we may use different techniques, such as
differential encoding and vector quantization [ 151 ], to encode the different coefficients.
Finally, the quantized value needs to be encoded using some binary encoding technique.
The binary codingmay be as simple as using a fixed-length code or as complex as a combination
of run-length coding and Huffman or arithmetic coding. We will see an example of the latter
when we describe the JPEG algorithm.
Search WWH ::




Custom Search