Some of the material presented in this chapter is not essential for understanding the techniques
described in this book. However, to follow some of the literature in this area, familiarity
with these topics is necessary. We have marked these sections accordingly. If you are primarily
interested in the techniques, you may wish to skip these sections, at least on first reading. On
the other hand, if you wish to delve more deeply into these topics, we have included a list of
resources at the end of this chapter that provide a more mathematically rigorous treatment of
this material.
When we were looking at lossless compression, one thing we never had to worry about
was how the reconstructed sequence would differ from the original sequence. By definition,
the reconstruction of a losslessly compressed sequence is identical to the original sequence.
However, there is only a limited amount of compression that can be obtained with lossless
compression. There is a floor (a hard one) defined by the entropy of the source, below which
we cannot drive the size of the compressed sequence. As long as we wish to preserve all of
the information in the source, the entropy, like the speed of light, is a fundamental limit.
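As a rough numerical illustration of this floor (not an example worked out in the text itself), the Python sketch below computes the first-order entropy of a hypothetical four-symbol source; the symbol probabilities and the helper name entropy are our own choices.

```python
import math

def entropy(probabilities):
    """First-order entropy in bits per symbol: H = -sum(p * log2 p)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical four-symbol source; any lossless scheme needs at least
# H bits per symbol on average, no matter how clever the code.
probs = [0.5, 0.25, 0.125, 0.125]
print(f"Entropy floor: {entropy(probs):.3f} bits/symbol")  # 1.750
```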
The limited amount of compression available from using lossless compression schemes
may be acceptable in several circumstances. The storage or transmission resources available
to us may be sufficient to handle our data requirements after lossless compression. Or the
possible consequences of a loss of information may be much more expensive than the cost of
additional storage and/or transmission resources. This would be the case with the storage and
archiving of bank records; an error in the records could turn out to be much more expensive
than the cost of buying additional storage media.
If neither of these conditions holds, that is, if resources are limited and we do not require
absolute integrity, we can improve the amount of compression by accepting a certain degree
of loss during the compression process. Performance measures are necessary to determine
the efficiency of our lossy compression schemes. For the lossless compression schemes, we
essentially used only the rate as the performance measure. That would not be feasible for lossy
compression. If rate were the only criterion for lossy compression schemes, where loss of
information is permitted, the best lossy compression scheme would be simply to throw away
all the data! Therefore, we need some additional performance measure, such as some measure
of the difference between the original and reconstructed data, which we will refer to as the
distortion in the reconstructed data. In the next section, we will look at some of the more
well-known measures of difference and discuss their advantages and shortcomings.
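As a preview of one such measure, the sketch below computes a mean squared difference between a hypothetical original sequence and its reconstruction; the sample values and the function name mean_squared_error are illustrative assumptions, not data from the text.

```python
def mean_squared_error(original, reconstructed):
    """Average squared difference between source and reconstruction."""
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)

# Hypothetical data: the reconstruction differs slightly from the source.
original = [10, 12, 9, 14, 11]
reconstructed = [10, 11, 9, 15, 12]
print(mean_squared_error(original, reconstructed))  # 0.6
```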
In the best of all possible worlds, we would like to incur the minimum amount of distortion
while compressing to the lowest rate possible. Obviously, there is a trade-off between
minimizing the rate and keeping the distortion small. The extreme cases are when we transmit
no information, in which case the rate is zero, or keep all the information, in which case
the distortion is zero. The rate for a discrete source is simply the entropy. The study of the
situations between these two extremes is called rate distortion theory. In this chapter we will
take a brief look at some important concepts related to this theory.
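To put rough numbers on these two extremes, the sketch below evaluates the classical rate distortion function of a binary source with P(1) = p under Hamming distortion, R(D) = H_b(p) − H_b(D) for 0 ≤ D ≤ p and R(D) = 0 otherwise. This standard result is quoted here only for orientation; the parameter value p = 0.3 and the function names are our own.

```python
import math

def binary_entropy(q):
    """Binary entropy H_b(q) in bits."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def rate_distortion_binary(p, D):
    """R(D) for a Bernoulli(p) source (p <= 1/2) with Hamming distortion."""
    if D >= p:
        return 0.0  # allowed distortion so large that no bits need be sent
    return binary_entropy(p) - binary_entropy(D)

p = 0.3
print(rate_distortion_binary(p, 0.0))  # distortion zero: rate equals the entropy H_b(0.3)
print(rate_distortion_binary(p, 0.3))  # rate zero: distortion as large as p
print(rate_distortion_binary(p, 0.1))  # an intermediate point on the trade-off curve
```

At D = 0 the rate equals the source entropy, and once D reaches p the rate drops to zero, matching the two extremes described above.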
We will need to expand the dictionary of models available for our use, for several reasons.
First, because we are now able to introduce distortion, we need to determine how to add
distortion intelligently. For this, we often need to look at the sources somewhat differently
than we have done previously. Another reason is that we will be looking at compression
schemes for sources that are analog in nature, even though we have treated them as discrete
sources in the past. We need models that more precisely describe the true nature of these