17.4.2 MPEG-4 AAC
MPEG-4 AAC adds a perceptual noise substitution (PNS) tool and substitutes a long term
prediction (LTP) tool for the interband prediction tool in the spectral coding block. In the
quantization and coding section, MPEG-4 AAC adds the options of Transform-Domain Weighted
Interleave Vector Quantization (TwinVQ) and Bit Sliced Arithmetic Coding (BSAC).
Perceptual Noise Substitution (PNS)
There are portions of music that sound like noise. Although this may sound like a harsh (or
realistic) subjective evaluation, that is not what is meant here. What is meant by noise here
is a portion of audio where the MDCT coefficients are stationary without containing tonal
components [222]. This kind of noiselike signal is the hardest to compress. However, at the
same time it is very difficult to distinguish one noiselike signal from another. MPEG-4 AAC
makes use of this fact by not transmitting the coefficients of such noiselike scalefactor bands.
Instead, the decoder is told that the band is noiselike and is sent only the power of the
coefficients in that band. The decoder generates a noiselike sequence with the appropriate
power and inserts it in place of the unsent coefficients.
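The substitution step can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the procedure of the standard: the function names are hypothetical, the band is assumed to have already been classified as noiselike, and the quantization and signaling of the power value are left out.

```python
import math
import random


def pns_encode_band(coeffs):
    """Encoder side: the only value kept for a noiselike band is its power
    (here the sum of squared MDCT coefficients)."""
    return sum(c * c for c in coeffs)


def pns_decode_band(power, width, rng):
    """Decoder side: generate `width` pseudo-random coefficients and scale
    them so that their power matches the transmitted value."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(width)]
    noise_power = sum(n * n for n in noise)
    if noise_power == 0.0:
        return [0.0] * width
    gain = math.sqrt(power / noise_power)
    return [gain * n for n in noise]


# Synthetic noiselike band (illustrative data only).
band = [random.Random(0).uniform(-1.0, 1.0) for _ in range(16)]
band = [random.Random(i).uniform(-1.0, 1.0) for i in range(16)]

power = pns_encode_band(band)
# The decoder uses its own random generator; it never sees `band`.
decoded = pns_decode_band(power, len(band), random.Random(1))
```

The decoded coefficients differ from the original ones sample by sample, but the band carries the same power, which is the property the scheme relies on.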
Long Term Prediction (LTP)
The interband prediction in MPEG-2 AAC is one of the more computationally expensive parts
of the algorithm. MPEG-4 AAC replaces that with a cheaper long term prediction (LTP)
module.
TwinVQ
The Transform-Domain Weighted Interleave Vector Quantization (TwinVQ) [282] option is
suggested in the MPEG-4 AAC scheme for low bit rates. Developed at NTT in the early 1990s,
the algorithm uses a two-stage process for flattening the MDCT coefficients. In the first stage,
a linear predictive coding algorithm is used to obtain the LPC coefficients for the audio data
corresponding to the MDCT coefficients. These coefficients are used to obtain the spectral
envelope for the audio data. Dividing the MDCT coefficients by this spectral envelope
results in some degree of “flattening” of the coefficients. The spectral envelope computed
from the LPC coefficients reflects the gross features of the envelope of the MDCT coefficients.
However, it does not reflect any of the fine structure. This fine structure is predicted from
the previous frame and provides further flattening of the MDCT coefficients. The flattened
coefficients are interleaved and grouped into subvectors and quantized. The flattening process
reduces the dynamic range of the coefficients, allowing them to be quantized using a smaller
VQ codebook than would otherwise have been possible. The decoder can reverse the flattening
process because the LPC coefficients are transmitted to it.
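The first flattening stage and the interleaving can be sketched as follows. This is a minimal illustration under stated assumptions rather than the TwinVQ algorithm itself: the function names are hypothetical, the LPC coefficients are taken as given, the sign convention A(z) = 1 - sum of a_k z^(-k) is a choice made here, and the fine-structure prediction from the previous frame and the vector quantizer are omitted.

```python
import cmath
import math


def lpc_envelope(lpc, n_bins):
    """Magnitude of the all-pole model 1/A(z) sampled at n_bins frequencies.
    Assumed convention: A(z) = 1 - sum_k lpc[k-1] * z**(-k)."""
    env = []
    for k in range(n_bins):
        w = math.pi * (k + 0.5) / n_bins
        a = 1.0 - sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(lpc))
        env.append(1.0 / abs(a))
    return env


def flatten(coeffs, env):
    """Encoder: divide each MDCT coefficient by the spectral envelope."""
    return [c / e for c, e in zip(coeffs, env)]


def unflatten(flat, env):
    """Decoder: rebuild the envelope from the received LPC coefficients
    and multiply it back in."""
    return [f * e for f, e in zip(flat, env)]


def interleave(flat, n_sub):
    """Subvector j takes elements j, j + n_sub, j + 2 * n_sub, ..."""
    return [flat[j::n_sub] for j in range(n_sub)]


def deinterleave(subvectors):
    n_sub = len(subvectors)
    out = [0.0] * sum(len(s) for s in subvectors)
    for j, s in enumerate(subvectors):
        out[j::n_sub] = s
    return out


def dynamic_range(values):
    mags = [abs(v) for v in values]
    return max(mags) / min(mags)


# Synthetic example: coefficients that follow a one-pole lowpass envelope.
lpc = [0.9]
env = lpc_envelope(lpc, 8)
coeffs = [e * (1.0 if k % 2 == 0 else -1.0) for k, e in enumerate(env)]

flat = flatten(coeffs, env)
subvectors = interleave(flat, 2)
restored = unflatten(deinterleave(subvectors), lpc_envelope(lpc, 8))
```

In this synthetic case the flattened coefficients all have nearly the same magnitude, so `dynamic_range(flat)` is much smaller than `dynamic_range(coeffs)`, which is what allows a smaller codebook. The decoder side needs only `lpc` and the subvectors to recover the coefficients.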