Figure 1.16: The visual object types supported by each visual profile of MPEG-4.
1.11 Audio compression
Perceptual coding in audio relies on the principle of auditory masking, which is treated in detail in section 4.1.
Masking causes the ear/brain combination to be less sensitive to sound at one frequency in the presence of
another at a nearby frequency. If a first tone is present in the input, then it will mask signals of lower level at nearby
frequencies. The quantizing of the first tone and of further tones at those frequencies can be made coarser. Fewer
bits are needed and a coding gain results. The increased quantizing error is allowable if it is masked by the
presence of the first tone.
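As a rough illustration of the coding gain, the sketch below estimates how many bits a uniform quantizer needs to keep its error just under a masking threshold. It assumes the common rule of thumb of about 6 dB of signal-to-noise ratio per bit; the name bits_needed and the example figures are hypothetical and are not taken from any MPEG layer.

```python
import math

# Assumption: each extra bit of a uniform quantizer buys roughly 6.02 dB of SNR.
DB_PER_BIT = 6.02

def bits_needed(signal_to_mask_db):
    """Smallest wordlength whose quantizing error stays below the masking threshold.

    signal_to_mask_db is how far (in dB) the masking threshold lies below
    the level of the signal being quantized (a hypothetical input here).
    """
    if signal_to_mask_db <= 0:
        return 0  # the signal itself is masked, so nothing need be sent
    return math.ceil(signal_to_mask_db / DB_PER_BIT)

# A tone whose neighbours are masked 30 dB down needs far fewer bits
# than a quantizer that must keep its error 96 dB down.
masked_bits = bits_needed(30)
unmasked_bits = bits_needed(96)
```

Under these assumptions the masked case needs 5 bits where the unmasked one needs 16, which is the kind of saving the text describes.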
1.11.1 Sub-band coding
Sub-band coding mimics the frequency analysis mechanism of the ear and splits the audio spectrum into a large
number of different bands.
Signals in these bands can then be quantized independently. The quantizing error which results is confined to the
frequency limits of the band and so it can be arranged to be masked by the program material.
The techniques used in Layers I and II of MPEG audio are based on sub-band coding, as are those used in DCC
(Digital Compact Cassette).
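The idea of quantizing bands independently can be sketched with a toy two-band split. This is only an illustration under simplifying assumptions: practical coders use many more bands and far better filters than the sum/difference pair used here, and all function names and step sizes are hypothetical.

```python
def split_two_bands(samples):
    """Split an even-length block into a low band (pair averages)
    and a high band (pair half-differences)."""
    pairs = list(zip(samples[0::2], samples[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high

def merge_two_bands(low, high):
    """Inverse of split_two_bands."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out

def quantize(band, step):
    """Uniform quantizer: one integer code per band sample."""
    return [round(v / step) for v in band]

def dequantize(codes, step):
    return [c * step for c in codes]

def code_block(samples, low_step, high_step):
    """Quantize each band with its own step size, then rebuild the block."""
    low, high = split_two_bands(samples)
    low_q = dequantize(quantize(low, low_step), low_step)
    high_q = dequantize(quantize(high, high_step), high_step)
    return merge_two_bands(low_q, high_q)

samples = [10, 12, 11, 9, 8, 8, 7, 10]
# Coarse step in the high band, fine step in the low band (hypothetical values).
decoded = code_block(samples, low_step=0.5, high_step=4.0)
```

Splitting the decoded block again shows that the coarse quantizing of the high band leaves the low band untouched, so the error stays within the band where it was introduced.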
1.11.2 Transform coding
In transform coding the time-domain audio waveform is converted into a frequency-domain representation by a
transform such as the Fourier, discrete cosine or wavelet transform (see Chapter 3). Transform coding takes advantage of the fact that
the amplitude or envelope of an audio signal changes relatively slowly and so the coefficients of the transform can
be transmitted relatively infrequently. Clearly such an approach breaks down in the presence of transients and
adaptive systems are required in practice. Transients cause the coefficients to be updated frequently whereas in
stationary parts of the signal such as sustained notes the update rate can be reduced. Discrete cosine transform
(DCT) coding is used in Layer III of MPEG audio and in the compression system of the Sony MiniDisc.
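The following sketch shows why a stationary signal suits transform coding: a sustained tone that lines up with one basis function collapses to a single significant coefficient. It uses a plain, unwindowed DCT-II written out directly, which is an assumption made for clarity and not the transform or block size of any particular MPEG layer; the names dct and idct are hypothetical helpers.

```python
import math

def dct(block):
    """Unnormalized DCT-II of a block of samples."""
    n = len(block)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        (coeffs[0] / 2
         + sum(coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n)))
        * 2 / n
        for i in range(n)
    ]

N = 16
# A sustained tone that coincides with basis function 3 (hypothetical test signal).
tone = [math.cos(math.pi * (i + 0.5) * 3 / N) for i in range(N)]

coeffs = dct(tone)
# Keep only the coefficients that are not negligibly small.
kept = {k: c for k, c in enumerate(coeffs) if abs(c) > 1e-6}

# Rebuild the block from the kept coefficients alone.
sparse = [kept.get(k, 0.0) for k in range(N)]
rebuilt = idct(sparse)
```

Here sixteen samples are represented by one coefficient. A transient in the block would spread energy across many coefficients, which is why adaptive systems are needed.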
1.11.3 Predictive coding
In a predictive coder there are two identical predictors, one in the coder and one in the decoder. Their job is to
examine a run of previous data values and to extrapolate forward to estimate or predict what the next value will be.
This prediction is subtracted from the actual next code value at the encoder to produce a prediction error, which is transmitted.
The decoder then adds the prediction error to its own prediction to obtain the output code value again.
Prediction can be used in the time domain, where sample values are predicted, or in the frequency domain, where
coefficient values are predicted. Time-domain predictive coders work with a short encode and decode delay and
are useful in telephony where a long loop delay causes problems. Frequency prediction is used in AC-3 and MPEG
AAC.
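The coder/decoder loop described in this section can be sketched in the time domain as follows. The predictor here is a simple linear extrapolation from the two previous values, chosen only for illustration; it is an assumption and not the predictor of any named system, and the function names are hypothetical. The error is sent unquantized, so this sketch is lossless.

```python
def predict(history):
    """Estimate the next value by extrapolating from the last two
    (hypothetical predictor; both ends must use the identical rule)."""
    if len(history) >= 2:
        return 2 * history[-1] - history[-2]
    if history:
        return history[-1]
    return 0

def encode(samples):
    """Return the prediction errors that would be transmitted."""
    history, errors = [], []
    for value in samples:
        errors.append(value - predict(history))
        history.append(value)
    return errors

def decode(errors):
    """Add each received error to the decoder's own prediction."""
    history = []
    for error in errors:
        history.append(predict(history) + error)
    return history

samples = [100, 102, 104, 107, 110, 112, 113, 113]
errors = encode(samples)   # mostly small values after the first sample
restored = decode(errors)  # identical to samples
```

After the first value the errors are small compared with the samples themselves, so they can be sent in fewer bits. If the error were quantized, the encoder would have to predict from the decoded values so that the two predictors stay in step.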
1.12 MPEG bitstreams
 