Audio Coding - Introduction to Data Compression

17.4.2 MPEG-4 AAC

MPEG-4 AAC adds a perceptual noise substitution (PNS) tool and substitutes a long term

prediction (LTP) tool for the interband prediction tool in the spectral coding block. In the

quantization and coding section MPEG-4 AAC adds the options of Transform-Domain Weighted

Interleave Vector Quantization (TwinVQ) and Bit Sliced Arithmetic Coding (BSAC).

Perceptual Noise Substitution (PNS)

There are portions of music that sound like noise. Although this may sound like a harsh (or

realistic) subjective evaluation, that is not what is meant here. What is meant by noise here

is a portion of audio where the MDCT coefficients are stationary without containing tonal

components [222]. This kind of noiselike signal is the hardest to compress. However, at the

same time it is very difficult to distinguish one noiselike signal from another. MPEG-4 AAC

makes use of this fact by not transmitting such noiselike scalefactor bands. Instead the decoder

is alerted to this fact and the power of the noiselike coefficients in this band is sent. The decoder

generates a noiselike sequence with the appropriate power and inserts it in place of the unsent

coefficients.
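
The substitution step can be illustrated with a minimal sketch. It assumes the band has already been classified as noiselike, that the transmitted "power" is the sum of squared coefficients, and that Gaussian noise is an acceptable stand-in at the decoder; none of these details come from the standard. The function names band_energy and substitute_noise are hypothetical.

```python
import math
import random


def band_energy(coeffs):
    """Encoder side: power of one scalefactor band (sum of squares)."""
    return sum(c * c for c in coeffs)


def substitute_noise(energy, width, rng):
    """Decoder side: make `width` random values whose power equals `energy`."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(width)]
    raw = sum(n * n for n in noise)
    if raw == 0.0:
        # Degenerate case: no noise energy to rescale.
        return [0.0] * width
    gain = math.sqrt(energy / raw)
    return [gain * n for n in noise]


# Stand-in for the MDCT coefficients of a noiselike band (assumed data).
original = [random.Random(0).uniform(-1.0, 1.0) for _ in range(16)]

# Only the band power is sent, not the 16 coefficients.
energy = band_energy(original)

# The decoder uses its own random source, so the values differ from the
# original while the band power is preserved.
decoded = substitute_noise(energy, len(original), random.Random(1))
```

The decoded band matches the original in power only; the individual coefficients are unrelated, which is acceptable precisely because one noiselike signal is hard to tell from another.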

Long Term Prediction

The interband prediction in MPEG-2 AAC is one of the more computationally expensive parts

of the algorithm. MPEG-4 AAC replaces that with a cheaper long term prediction (LTP)

module.

TwinVQ

The Transform-Domain Weighted Interleave Vector Quantization (TwinVQ) [282] option is

suggested in the MPEG-4 AAC scheme for low bit rates. Developed at NTT in the early 1990s,

the algorithm uses a two-stage process for flattening the MDCT coefficients. In the first stage,

a linear predictive coding algorithm is used to obtain the LPC coefficients for the audio data

corresponding to the MDCT coefficients. These coefficients are used to obtain the spectral

envelope for the audio data. Dividing the MDCT coefficients by this spectral envelope

results in some degree of “flattening” of the coefficients. The spectral envelope computed

from the LPC coefficients reflects the gross features of the envelope of the MDCT coefficients.

However, it does not reflect any of the fine structure. This fine structure is predicted from

the previous frame and provides further flattening of the MDCT coefficients. The flattened

coefficients are interleaved and grouped into subvectors and quantized. The flattening process

reduces the dynamic range of the coefficients, allowing them to be quantized using a smaller

VQ codebook than would otherwise have been possible. Because the LPC coefficients are

transmitted to the decoder, the decoder can rebuild the envelope and reverse the flattening.
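
The first flattening stage and the interleaving can be sketched as follows. This is an illustration under stated assumptions, not the TwinVQ algorithm itself: the envelope is taken to be 1/|A(e^{jw})| for a prediction filter A(z) = 1 - sum a_k z^{-k}, the fine-structure prediction from the previous frame and the vector quantizer are omitted, and the interleaving simply assigns every n-th coefficient to the same subvector. All function names and the test data are hypothetical.

```python
import cmath
import math


def lpc_envelope(lpc, n_bins):
    """Spectral envelope 1/|A(e^{jw})| sampled at n_bins frequencies in (0, pi)."""
    env = []
    for k in range(n_bins):
        w = math.pi * (k + 0.5) / n_bins
        a = 1.0 - sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(lpc))
        env.append(1.0 / abs(a))
    return env


def interleave(values, n_sub):
    """Split into n_sub subvectors; subvector j takes elements j, j+n_sub, ..."""
    return [values[j::n_sub] for j in range(n_sub)]


def deinterleave(subvectors):
    """Inverse of interleave: put each subvector back at its strided positions."""
    n_sub = len(subvectors)
    out = [0.0] * sum(len(s) for s in subvectors)
    for j, sub in enumerate(subvectors):
        out[j::n_sub] = sub
    return out


def dynamic_range(values):
    """Ratio of the largest to the smallest magnitude."""
    mags = [abs(v) for v in values]
    return max(mags) / min(mags)


lpc = [0.9]  # assumed first-order predictor, giving a low-pass envelope
env = lpc_envelope(lpc, 16)

# Assumed MDCT coefficients: the envelope times a small fine-structure pattern.
pattern = [1.0, -0.8, 0.9, -1.1]
mdct = [env[k] * pattern[k % 4] for k in range(16)]

# Encoder: flatten by dividing by the envelope, then interleave into subvectors.
flat = [c / e for c, e in zip(mdct, env)]
subvectors = interleave(flat, 4)

# Decoder: rebuild the envelope from the same LPC coefficients and undo both steps.
restored = [f * e for f, e in zip(deinterleave(subvectors), lpc_envelope(lpc, 16))]
```

In this toy setup the flattened coefficients span a much narrower range of magnitudes than the original ones, which is the property that lets a smaller VQ codebook be used.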
