Audio Coding - Introduction to Data Compression


FIGURE 17.4 The MPEG audio coding algorithms. [Block diagram; block labels: Input, Time-frequency mapping, Psychoacoustic model, Quantization and coding, Framing, MPEG bitstream.]

scalefactor. The scalefactor is used to make sure that the coefficients make use of the entire

range of the quantizer. The subband output is divided by the scalefactor before being linearly

quantized. There are a total of 63 scalefactors specified in the MPEG standard. Specification

of each scalefactor requires 6 bits.
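
As a rough illustration of the normalization step described above, the following Python sketch divides a block of subband samples by a scalefactor and then applies a uniform midtread quantizer. The function names, the clipping to [-1, 1], and the rounding rule are assumptions made for illustration; the standard defines its own quantization formula and scalefactor table, neither of which is reproduced here.

```python
import math


def quantize_subband(samples, scalefactor, levels):
    """Normalize samples by the scalefactor, then quantize with a uniform
    midtread quantizer that has an odd number of levels (illustrative only)."""
    if levels < 3 or levels % 2 == 0:
        raise ValueError("a midtread quantizer needs an odd number of levels >= 3")
    half = (levels - 1) // 2  # indices run from -half to +half, with 0 in the middle
    indices = []
    for x in samples:
        y = x / scalefactor            # normalized value, ideally within [-1, 1]
        y = max(-1.0, min(1.0, y))     # assumption: clip anything outside the range
        indices.append(math.floor(y * half + 0.5))  # round to the nearest level
    return indices


def dequantize_subband(indices, scalefactor, levels):
    """Map quantizer indices back to sample values and undo the normalization."""
    half = (levels - 1) // 2
    return [scalefactor * (q / half) for q in indices]
```

For example, `quantize_subband([0.5, -0.25, 0.0], 0.5, 7)` returns `[3, -1, 0]`: the largest sample lands on the outermost level, which is the point of dividing by the scalefactor first.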

To determine the number of bits to be used for quantization, the coder makes use of the

psychoacoustic model. The inputs to the model include the fast Fourier transform (FFT) of

the audio data as well as the signal itself. The model calculates the masking thresholds in each

subband, which in turn determine the amount of quantization noise that can be tolerated and

hence the quantization step size. As the quantizers all cover the same range, selection of the

quantization step size is the same as selection of the number of bits to be used for quantizing the

output of each subband. In Layer I the encoder has a choice of 14 different quantizers for each

band (plus the option of assigning 0 bits). The quantizers are all midtread quantizers ranging

from 3 levels to 65,535 levels. Each subband gets assigned a variable number of bits. However,

the total number of bits available to represent all the subband samples is fixed. Therefore, the

bit allocation can be an iterative process. The objective is to keep the noise-to-mask ratio more

or less constant across the subbands.
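
One way to picture the iterative bit allocation is a greedy loop that repeatedly gives more bits to the subband whose noise-to-mask ratio is currently worst, until the fixed budget runs out. The Python sketch below is a simplified illustration, not the encoder procedure from the standard. It assumes the psychoacoustic model supplies a signal-to-mask ratio in dB for each subband, that each extra bit buys about 6.02 dB of signal-to-noise ratio, that the 14 nonzero choices are 2 to 15 bits per sample, and that each subband contributes 12 samples per frame. It also ignores the bits spent on scalefactors and other side information. All names are hypothetical.

```python
# Assumption: 0 bits, or one of 14 quantizers using 2..15 bits per sample.
ALLOWED_BITS = [0] + list(range(2, 16))
# Assumption: rule-of-thumb SNR gain per quantizer bit, in dB.
DB_PER_BIT = 6.02


def allocate_bits(smr_db, bit_budget, samples_per_band=12):
    """Greedy allocation: upgrade the band with the worst noise-to-mask ratio
    (NMR = SMR - SNR) until no affordable upgrade remains."""
    choice = [0] * len(smr_db)   # index into ALLOWED_BITS for each subband
    remaining = bit_budget

    def nmr(band):
        return smr_db[band] - DB_PER_BIT * ALLOWED_BITS[choice[band]]

    def upgrade_cost(band):
        current = ALLOWED_BITS[choice[band]]
        nxt = ALLOWED_BITS[choice[band] + 1]
        return (nxt - current) * samples_per_band

    while True:
        affordable = [
            band for band in range(len(smr_db))
            if choice[band] + 1 < len(ALLOWED_BITS)
            and upgrade_cost(band) <= remaining
        ]
        if not affordable:
            break
        worst = max(affordable, key=nmr)   # band with the most audible noise
        remaining -= upgrade_cost(worst)
        choice[worst] += 1
    return [ALLOWED_BITS[c] for c in choice]
```

With signal-to-mask ratios of 20, 5, and -10 dB and a budget of 72 bits, `allocate_bits([20.0, 5.0, -10.0], 72)` returns `[4, 2, 0]`: the band whose signal is already below the masking threshold receives no bits at all.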

The outputs of the quantization and bit allocation steps are combined into a frame as shown

in Figure 17.5. Because MPEG audio is a streaming format, each frame carries a header, rather

than having a single header for the entire audio sequence. The header is made up of 32 bits.

The first 12 bits comprise a sync pattern consisting of all 1s. This is followed by a 1-bit version

ID, a 2-bit layer indicator, and 1 bit to indicate CRC protection. The CRC protection bit is set

to 0 if there is CRC protection and is set to 1 if there is no CRC protection. If the layer and

protection information is known, all 16 bits can be used for providing frame synchronization.

The next 4 bits make up the bit rate index, which specifies the bit rate in kbits/sec. There are

14 specified bit rates to choose from. This is followed by 2 bits that indicate the sampling

frequency. The sampling frequencies for MPEG-1 and MPEG-2 are different (one of the few

differences between the audio coding standards for MPEG-1 and MPEG-2) and are shown in

Table 17.1. These bits are followed by a single padding bit. If the bit is “1,” the frame needs

an additional bit to adjust the bit rate to the sampling frequency. The next two bits indicate the

mode. The possible modes are “stereo,” “joint stereo,” “dual channel,” and “single channel.”
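
The header fields listed so far can be pulled out of the 32-bit word with shifts and masks. The Python sketch below is a minimal illustration. It assumes the header arrives as 4 big-endian bytes, and it follows the standard's layout, in which a 1-bit private field (not mentioned above) sits between the padding bit and the mode field. The function and key names are hypothetical, and the version, layer, bit rate, and sampling frequency fields are returned as raw indices rather than decoded values.

```python
# Mode names in the order given in the text, indexed by the 2-bit mode field.
MODES = ("stereo", "joint stereo", "dual channel", "single channel")


def parse_header(header_bytes):
    """Extract the leading fields of a 32-bit MPEG audio frame header."""
    if len(header_bytes) != 4:
        raise ValueError("the header is exactly 32 bits (4 bytes)")
    word = int.from_bytes(header_bytes, "big")
    if (word >> 20) != 0xFFF:                     # first 12 bits must all be 1s
        raise ValueError("sync pattern not found")
    return {
        "version_id": (word >> 19) & 0x1,         # 1-bit version ID
        "layer_bits": (word >> 17) & 0x3,         # 2-bit layer indicator (raw)
        "protection_bit": (word >> 16) & 0x1,     # 1-bit CRC protection flag (raw)
        "bitrate_index": (word >> 12) & 0xF,      # 4-bit bit rate index
        "sampling_index": (word >> 10) & 0x3,     # 2-bit sampling frequency index
        "padding": (word >> 9) & 0x1,             # 1-bit padding flag
        "private": (word >> 8) & 0x1,             # assumption: private bit per the standard
        "mode": MODES[(word >> 6) & 0x3],         # 2-bit mode
    }
```

For instance, `parse_header(bytes([0xFF, 0xFB, 0x90, 0x64]))` yields a bit rate index of 9, a sampling frequency index of 0, no padding, and the mode "joint stereo".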

The stereo mode consists of two channels that are encoded separately but intended to be played
