1.6 Compression principles
In a PCM digital system the bit rate is the product of the sampling rate and the number of bits in each sample, and this product is generally constant.
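As a simple worked example (the figures below are the standard CD-audio parameters, used here purely for illustration):

```python
# PCM bit rate = sampling rate x bits per sample x number of channels.
# CD-audio figures are used only as an illustration.
sampling_rate_hz = 44_100
bits_per_sample = 16
channels = 2

bit_rate = sampling_rate_hz * bits_per_sample * channels
print(bit_rate)  # 1411200 bits/s, constant regardless of what the signal contains
```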
Nevertheless, the information rate of a real signal varies. In all real signals, part of the signal is obvious from what has gone before or what may come later, and a suitable receiver can predict that part so that only the true information actually has to be sent. If the characteristics of a predicting receiver are known, the transmitter can omit parts of the message in the knowledge that the receiver has the ability to re-create them. Thus all encoders must contain a model of the decoder.
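A minimal sketch of that principle, assuming a first-order predictor in which each sample is predicted to equal the previous one (a simplification chosen for illustration, not a scheme specified in the text):

```python
# Prediction-based coding with a first-order predictor: predict each
# sample as the previous one and transmit only the residual (prediction
# error). The decoder runs the same predictor, so the encoder
# effectively contains a model of the decoder.

def encode(samples):
    previous = 0
    residuals = []
    for s in samples:
        residuals.append(s - previous)  # send only the surprise
        previous = s
    return residuals

def decode(residuals):
    previous = 0
    samples = []
    for r in residuals:
        previous += r  # predictor output plus transmitted error
        samples.append(previous)
    return samples

signal = [10, 11, 12, 12, 13, 20]
assert decode(encode(signal)) == signal
print(encode(signal))  # [10, 1, 1, 0, 1, 7] -- small values where predictable
```

Because the residuals of a predictable signal cluster near zero, they need fewer bits than the raw samples; only the surprising part of the signal genuinely has to be sent.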
One definition of information is that it is the unpredictable or surprising element of data. Newspapers are a good example of information because they mention only items which are surprising. Newspapers never carry items about individuals who have not been involved in an accident, as this is the normal case. Consequently the phrase 'no news is good news' is remarkably true: if an information channel exists but nothing has been sent, it is most likely that nothing remarkable has happened.
The unpredictability of the punch line is a useful measure of how funny a joke is. Often the build-up paints a certain
picture in the listener's imagination, which the punch line destroys utterly. One of the author's favourites is the one
about the newly married couple who didn't know the difference between putty and petroleum jelly - their windows
fell out.
The difference between the information rate and the overall bit rate is known as the redundancy. Compression
systems are designed to eliminate as much of that redundancy as practicable or perhaps affordable. One way in
which this can be done is to exploit statistical predictability in signals. The information content or entropy of a
sample is a function of how different it is from the predicted value. Most signals have some degree of predictability.
A sine wave is highly predictable, because all cycles look the same. According to Shannon's theory, any signal
which is totally predictable carries no information. In the case of the sine wave this is clear because it represents a
single frequency and so has no bandwidth.
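A minimal sketch of how entropy relates to prediction, assuming a zeroth-order estimate in which symbol probabilities come from observed frequencies (an illustrative toy model, not a method from the text):

```python
# Zeroth-order Shannon entropy estimated from symbol frequencies.
# This toy estimate ignores ordering between symbols; it is used here
# only to show how prediction exposes redundancy.
from collections import Counter
from math import log2

def entropy_bits_per_symbol(symbols):
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

ramp = list(range(256))                                # highly predictable signal
residuals = [b - a for a, b in zip([0] + ramp, ramp)]  # first-order prediction errors

print(entropy_bits_per_symbol(ramp))       # 8.0 bits/symbol: raw samples look rich
print(entropy_bits_per_symbol(residuals))  # ~0.04 bits/symbol: almost no information
```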
At the opposite extreme, a signal such as noise is completely unpredictable, and as a result all codecs find noise difficult. The most efficient way of coding noise is PCM. This has two consequences. First, a codec which is designed using the statistics of real material should not be tested with random noise, because it is not a representative test. Second, a codec which performs well with clean source material may perform badly with source material containing superimposed noise.
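One way to see this in practice is to try a general-purpose lossless coder on both kinds of data; the sketch below uses Python's standard zlib purely as a convenient example, not as the codec under discussion:

```python
# Lossless compression of predictable data vs. random noise. The point
# is that noise resists compression, not that zlib is the method at hand.
import os
import zlib

predictable = bytes(range(256)) * 256   # 65536 bytes of a repeating ramp
noise = os.urandom(65536)               # 65536 bytes of random data

print(len(zlib.compress(predictable)))  # a few hundred bytes at most
print(len(zlib.compress(noise)))        # about 65536 bytes, or slightly more
```

The predictable stream shrinks dramatically, while the noise comes out essentially at its raw PCM size.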
Most practical compression units require some form of pre-processing before the compression stage proper, and appropriate noise reduction should be incorporated into that pre-processing if noisy signals are anticipated. It will also be necessary to restrict the degree of compression applied to noisy signals.
All real signals fall part-way between the extremes of total predictability and total unpredictability or noisiness. If the
bandwidth (set by the sampling rate) and the dynamic range (set by the wordlength) of the transmission system are
used to delineate an area, this sets a limit on the information capacity of the system. Figure 1.5(a) shows that most real signals only occupy part of that area. The signal may not contain all frequencies, or it may not have full dynamics at certain frequencies.
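A minimal sketch of this idea, assuming a 1 kHz sine wave sampled at 48 kHz (illustrative figures, not taken from the text): the spectrum shows how little of the available frequency area such a signal actually occupies.

```python
# A real signal rarely fills the whole bandwidth/dynamic-range "area".
# A 1 kHz sine sampled at 48 kHz concentrates all its energy in a
# single frequency bin of the spectrum.
import numpy as np

fs = 48_000
t = np.arange(fs) / fs                   # one second of samples
signal = np.sin(2 * np.pi * 1_000 * t)

spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
occupied = np.sum(spectrum > 1e-6)       # bins with significant energy
print(occupied, "of", len(spectrum), "bins occupied")  # 1 of 24001
```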
 