FIGURE 17.4 The MPEG audio coding algorithms. [Block diagram with blocks labeled Input, Time-frequency mapping, Psychoacoustic model, Quantization and coding, Framing, and MPEG bitstream.]
scalefactor. The scalefactor is used to make sure that the coefficients make use of the entire
range of the quantizer. The subband output is divided by the scalefactor before being linearly
quantized. There are a total of 63 scalefactors specified in the MPEG standard. Specification
of each scalefactor requires 6 bits.
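The scalefactor step described above can be sketched in a few lines of Python. This is only an illustration: the table formula (63 entries of the form 2 * 2^(-i/3)) and the helper names choose_scalefactor and normalize are assumptions made for this sketch, not something stated in the text above.

```python
# Assumed scalefactor table: 63 decreasing entries, 2.0 * 2**(-i/3), i = 0..62.
# An index into this table fits in 6 bits, matching the text above.
SCALEFACTORS = [2.0 * 2.0 ** (-i / 3.0) for i in range(63)]


def choose_scalefactor(samples):
    """Return (index, value) of the smallest table entry >= the peak magnitude."""
    peak = max(abs(s) for s in samples)
    # The table is decreasing, so scan from the smallest entry upward.
    for index in range(len(SCALEFACTORS) - 1, -1, -1):
        if SCALEFACTORS[index] >= peak:
            return index, SCALEFACTORS[index]
    # Peak exceeds every entry: fall back to the largest scalefactor.
    return 0, SCALEFACTORS[0]


def normalize(samples):
    """Divide the subband samples by their scalefactor before quantization."""
    index, scalefactor = choose_scalefactor(samples)
    return index, [s / scalefactor for s in samples]
```

Picking the smallest scalefactor that still covers the peak keeps the normalized samples within [-1, 1] while using as much of the quantizer range as the table allows.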
To determine the number of bits to be used for quantization, the coder makes use of the
psychoacoustic model. The inputs to the model include the fast Fourier transform (FFT) of
the audio data as well as the signal itself. The model calculates the masking thresholds in each
subband, which in turn determine the amount of quantization noise that can be tolerated and
hence the quantization step size. As the quantizers all cover the same range, selection of the
quantization step size is the same as selection of the number of bits to be used for quantizing the
output of each subband. In Layer I the encoder has a choice of 14 different quantizers for each
band (plus the option of assigning 0 bits). The quantizers are all midtread quantizers ranging
from 3 levels to 32,767 levels (word lengths of 2 to 15 bits). Each subband gets assigned a variable number of bits. However,
the total number of bits available to represent all the subband samples is fixed. Therefore, the
bit allocation can be an iterative process. The objective is to keep the noise-to-mask ratio more
or less constant across the subbands.
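One way to picture the iterative bit allocation is a greedy loop that keeps handing bits to the subband with the worst noise-to-mask ratio until the budget runs out. The sketch below is not the procedure from the standard. It assumes that each added bit improves the signal-to-noise ratio by about 6.02 dB, that word lengths run from 2 to 15 bits, and that each subband carries 12 samples per frame. The function name allocate_bits and its parameters are hypothetical.

```python
DB_PER_BIT = 6.02   # assumed SNR gain per quantizer bit, in dB
MAX_BITS = 15       # assumed longest word length (14 non-zero choices: 2..15)


def allocate_bits(smr_db, bit_budget, samples_per_band=12):
    """Greedy sketch: give bits to the band with the highest noise-to-mask ratio.

    smr_db holds the signal-to-mask ratio of each subband in dB, as a
    psychoacoustic model might supply it. Returns the bits per sample
    assigned to each subband.
    """
    bits = [0] * len(smr_db)
    remaining = bit_budget
    while True:
        # Noise-to-mask ratio = signal-to-mask ratio minus approximate SNR.
        nmr = [smr - DB_PER_BIT * b for smr, b in zip(smr_db, bits)]
        # Try bands from worst (highest) NMR to best.
        order = sorted(range(len(bits)), key=lambda k: nmr[k], reverse=True)
        for k in order:
            # The smallest quantizer uses 2 bits, so 0 jumps straight to 2.
            step = 2 if bits[k] == 0 else 1
            cost = step * samples_per_band
            if bits[k] + step <= MAX_BITS and cost <= remaining:
                bits[k] += step
                remaining -= cost
                break
        else:
            # No band can accept more bits within the budget.
            return bits
```

For example, allocate_bits([20.0, 5.0, -10.0], 72) spends the whole budget on the two bands whose signal lies above the mask and leaves the third band at 0 bits.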
The outputs of the quantization and bit allocation steps are combined into a frame as shown
in Figure 17.5. Because MPEG audio is a streaming format, each frame carries a header, rather
than having a single header for the entire audio sequence. The header is made up of 32 bits.
The first 12 bits comprise a sync pattern consisting of all 1s. This is followed by a 1-bit version
ID, a 2-bit layer indicator, and 1 bit to indicate CRC protection. The CRC protection bit is set
to 0 if there is CRC protection and is set to 1 if there is no CRC protection. If the layer and
protection information is known, all 16 bits can be used for providing frame synchronization.
The next 4 bits make up the bit rate index, which specifies the bit rate in kbits/sec. There are
14 specified bit rates to choose from. This is followed by 2 bits that indicate the sampling
frequency. The sampling frequencies for MPEG-1 and MPEG-2 are different (one of the few
differences between the audio coding standards for MPEG-1 and MPEG-2) and are shown in
Table 17.1. These bits are followed by a single padding bit. If the bit is “1,” the frame contains
an additional slot (4 bytes in Layer I, 1 byte in Layers II and III) to adjust the bit rate to the
sampling frequency. One private bit follows, and the next two bits indicate the mode. The possible
modes are “stereo,” “joint stereo,” “dual channel,” and “single channel.”
The stereo mode consists of two channels that are encoded separately but intended to be played
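The header fields listed so far can be unpacked with a few shifts and masks. The Python sketch below is an illustration under stated assumptions: it follows the ISO/IEC 11172-3 bit layout, in which one private bit sits between the padding bit and the 2-bit mode field. The layer and mode code tables are taken from that layout, and the name parse_header is hypothetical. The bit rate index, sampling frequency code, and protection bit are returned as raw values rather than interpreted.

```python
# Mode codes 00, 01, 10, 11 (assumed to follow the order in the text).
MODES = ("stereo", "joint stereo", "dual channel", "single channel")
# Assumed layer codes; 0b00 is reserved.
LAYERS = {0b11: "I", 0b10: "II", 0b01: "III"}


def parse_header(data):
    """Unpack the leading fields of a 32-bit MPEG audio frame header."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes")
    word = int.from_bytes(data[:4], "big")
    # The first 12 bits must be the all-ones sync pattern.
    if ((word >> 20) & 0xFFF) != 0xFFF:
        raise ValueError("sync pattern not found")
    return {
        "version_id": (word >> 19) & 0x1,
        "layer": LAYERS.get((word >> 17) & 0x3, "reserved"),
        "protection_bit": (word >> 16) & 0x1,   # raw bit, not interpreted
        "bitrate_index": (word >> 12) & 0xF,    # raw 4-bit index
        "sampling_code": (word >> 10) & 0x3,    # raw 2-bit code
        "padding": (word >> 9) & 0x1,
        # Bit 8 is the private bit (assumed layout) and is skipped here.
        "mode": MODES[(word >> 6) & 0x3],
    }
```

Calling parse_header(bytes([0xFF, 0xFF, 0x90, 0x44])) yields a Layer I header with bit rate index 9 and mode "joint stereo". The remaining six bits of the header are not decoded in this sketch.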
 