of the size of the original data. In this particular example, the compression ratio calculated in
this manner would be 75%.
Another way of reporting compression performance is to provide the average number of
bits required to represent a single sample. This is generally referred to as the rate. For example,
in the case of the compressed image described above, if we assume 8 bits per byte (or pixel),
the average number of bits per pixel in the compressed representation is 2. Thus, we would
say that the rate is 2 bits per pixel.
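The arithmetic behind these two figures can be made concrete with a small sketch. The image dimensions below are assumptions chosen only to illustrate the calculation; any original size compressed to a quarter of its bytes gives the same numbers.

```python
# Hypothetical example: a 256 x 256 grayscale image stored at 8 bits per pixel,
# compressed to one quarter of its original size.
original_bytes = 256 * 256              # one byte (8 bits) per pixel
compressed_bytes = original_bytes // 4  # compressed representation

# Compression performance expressed as a percentage reduction in size
reduction = 100 * (original_bytes - compressed_bytes) / original_bytes

# Rate: average number of bits per pixel in the compressed representation
rate = 8 * compressed_bytes / original_bytes

print(reduction)  # 75.0 (percent)
print(rate)       # 2.0 (bits per pixel)
```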
In lossy compression, the reconstruction differs from the original data. Therefore, in
order to determine the efficiency of a compression algorithm, we have to have some way of
quantifying the difference. The difference between the original and the reconstruction is often
called the distortion. (We will describe several measures of distortion in Chapter 8.) Lossy
techniques are generally used for the compression of data that originate as analog signals, such
as speech and video. In compression of speech and video, the final arbiter of quality is human.
Because human responses are difficult to model mathematically, many approximate measures
of distortion are used to determine the quality of the reconstructed waveforms. We will discuss
this topic in more detail in Chapter 8.
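One widely used approximate measure of this kind is the mean squared error (MSE) between the original samples and the reconstruction. The sketch below uses made-up sample values purely for illustration; MSE is only one of the measures the text alludes to, not the definitive choice.

```python
# A sketch of one commonly used distortion measure, the mean squared error
# (MSE), between an original signal and its lossy reconstruction.
# The sample values here are invented for illustration.
original = [10, 12, 9, 14, 11]
reconstruction = [10, 11, 9, 15, 11]

# Average of the squared sample-by-sample differences
mse = sum((x - y) ** 2 for x, y in zip(original, reconstruction)) / len(original)

print(mse)  # 0.4
```

A smaller MSE corresponds to higher fidelity in the mathematical sense, though, as the text notes, it may not track perceived quality exactly.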
Other terms that are also used when talking about differences between the reconstruction
and the original are fidelity and quality. When we say that the fidelity or quality of a recon-
struction is high, we mean that the difference between the reconstruction and the original is
small. Whether this difference is a mathematical difference or a perceptual difference should
be evident from the context.
1.2 Modeling and Coding
While reconstruction requirements may force the decision of whether a compression scheme
is to be lossy or lossless, the exact compression scheme we use will depend on a number of
different factors. Some of the most important factors are the characteristics of the data that need
to be compressed. A compression technique that will work well for the compression of text may
not work well for compressing images. Each application presents a different set of challenges.
There is a saying attributed to Bob Knight, the former basketball coach at Indiana University
and Texas Tech University: “If the only tool you have is a hammer, you approach every problem
as if it were a nail.” Our intention in this topic is to provide you with a large number of tools
that you can use to solve a particular data compression problem. It should be remembered that
data compression, if it is a science at all, is an experimental science. The approach that works
best for a particular application will depend to a large extent on the redundancies inherent in
the data.
The development of data compression algorithms for a variety of data can be divided
into two phases. The first phase is usually referred to as modeling . In this phase, we try to
extract information about any redundancy that exists in the data and describe the redundancy
in the form of a model. The second phase is called coding . A description of the model and
a “description” of how the data differ from the model are encoded, generally using a binary
alphabet. The difference between the data and the model is often referred to as the residual .
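The modeling/coding split can be sketched in a few lines. The sequence and the linear model below are assumptions invented for illustration: the model is fit by inspection, and the point is only that the residuals span a much smaller range of values than the data, so they can be coded with fewer bits.

```python
# Illustrative sketch of the modeling/coding split: describe a made-up
# sequence with a simple linear model, then keep only the residuals
# (data minus model), which the coding phase would encode.
data = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17]

# Hypothetical model chosen by inspection: the n-th sample is about n + 9
model = [n + 9 for n in range(len(data))]

# Residual: how the data differ from the model
residual = [x - m for x, m in zip(data, model)]

print(residual)  # [0, 1, 0, -1, 1, -1, 0, 1, -1, -1]
```

The original samples take values from 9 to 17, while every residual lies in {-1, 0, 1}, a three-letter alphabet that is far cheaper to encode; the decoder recovers the data exactly by adding the residuals back to the model.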