Audio compression - The MPEG

Figure 4.32(b) shows the Layer II elementary stream structure. Following the sync pattern, the bit-allocation data

are sent. The requantizing process of Layer II is more complex than in Layer I. The sub-bands are categorized into

three frequency ranges, low, medium and high, and the requantizing in each range is different. Low-frequency

samples can be quantized into 15 different wordlengths, mid-frequencies into seven different wordlengths and high

frequencies into only three different wordlengths. Accordingly, the bit-allocation data use words of four, three and

two bits depending on the sub-band concerned. This reduces the amount of allocation data to be sent. In each

case one extra combination exists in the allocation code. This is used to indicate that no data are being sent for

that sub-band.
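
The field widths follow from the number of choices in each range: an allocation word must distinguish every available wordlength plus the one "no data" combination. A minimal sketch of that arithmetic is given below. The split of sub-bands into 11 low, 12 medium and 4 high is a hypothetical illustration only; the actual boundaries are not given in the text above.

```python
# Number of distinct wordlengths available in each frequency range (from the text).
WORDLENGTH_CHOICES = {"low": 15, "medium": 7, "high": 3}


def allocation_field_bits(choices: int) -> int:
    """Bits needed to signal one of `choices` wordlengths or 'no data sent'.

    choices + 1 combinations are needed; for a positive integer n,
    n.bit_length() equals ceil(log2(n + 1)).
    """
    return choices.bit_length()


def allocation_data_bits(band_counts: dict) -> int:
    """Total bit-allocation bits for one channel, given sub-bands per range."""
    return sum(
        count * allocation_field_bits(WORDLENGTH_CHOICES[name])
        for name, count in band_counts.items()
    )


# Hypothetical split of sub-bands into ranges, for illustration only.
example_split = {"low": 11, "medium": 12, "high": 4}

field_bits = {name: allocation_field_bits(c) for name, c in WORDLENGTH_CHOICES.items()}
variable_total = allocation_data_bits(example_split)
# Cost if every sub-band carried a four-bit allocation word instead.
fixed_total = 4 * sum(example_split.values())
```

With this assumed split, the variable-width fields cost 88 bits per channel, against 108 bits if every sub-band used a four-bit word.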

The 1152 sample block of Layer II is divided into three blocks of 384 samples so that the same companding

structure as Layer I can be used. The 2 dB step size in the scale factors is retained. However, not all the scale

factors are transmitted, because they contain a degree of redundancy. In real program material, the difference

between scale factors in successive blocks in the same band exceeds 2 dB less than 10 per cent of the time. Layer

II coders analyse the set of three successive scale factors in each sub-band. On a stationary program, these will be

the same and only one scale factor out of three is sent. As the transient content increases in a given sub-band, two

or three scale factors will be sent. A two-bit code known as SCFSI (scale factor select information) must be sent to

allow the decoder to determine which of the three possible scale factors have been sent for each sub-band. This

technique effectively halves the scale factor bit rate.
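
The selection mechanism can be sketched as follows. The numbering of the four SCFSI patterns and the six-bit width of a scale factor index are assumptions here and should be checked against ISO/IEC 11172-3. The encoder-side choice is also simplified: it merges only exactly equal neighbours, whereas a practical coder would also merge scale factors that differ by a small amount.

```python
SCALE_FACTOR_BITS = 6  # assumed width of one scale factor index
SCFSI_BITS = 2         # two-bit select code, as stated in the text

# Assumed SCFSI patterns: for each code, the position in the transmitted list
# that each of the three 384-sample blocks takes its scale factor from.
SCFSI_PATTERNS = {
    0: (0, 1, 2),  # three scale factors sent, one per block
    1: (0, 0, 1),  # two sent: the first covers blocks 0 and 1
    2: (0, 0, 0),  # one sent, shared by all three blocks
    3: (0, 1, 1),  # two sent: the second covers blocks 1 and 2
}


def choose_scfsi(scale_factors):
    """Pick the cheapest pattern that reproduces the three indices exactly."""
    a, b, c = scale_factors
    if a == b == c:
        return 2, [a]
    if a == b:
        return 1, [a, c]
    if b == c:
        return 3, [a, b]
    return 0, [a, b, c]


def expand_scale_factors(scfsi, sent):
    """Decoder side: rebuild one scale factor per block from the sent subset."""
    return [sent[i] for i in SCFSI_PATTERNS[scfsi]]


def side_info_bits(sent):
    """Bits spent on SCFSI plus the transmitted scale factors for one sub-band."""
    return SCFSI_BITS + SCALE_FACTOR_BITS * len(sent)
```

Under these assumptions a stationary sub-band such as `(20, 20, 20)` costs 8 bits instead of the 18 needed to send all three scale factors, while a fully transient one costs 20 bits because the select code is an overhead.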

As for Layer I, the requantizing process always uses an odd number of steps to allow a true centre zero step. In

long-wordlength codes this is not a problem, but when three, five or nine quantizing intervals are used, binary is

inefficient because some combinations are not used. For example, five intervals need a three-bit code having eight

combinations leaving three unused. The solution is that when three-, five- or nine-level coding is used in a sub-band,

sets of three samples are encoded into a granule. Figure 4.34 shows how granules work. Continuing the

example of five quantizing intervals, each sample could have five different values, therefore all combinations of

three samples could have 125 different values. As 128 values can be sent with a seven-bit code, it will be seen that

this is more efficient than coding the samples separately, as three five-level codes would need nine bits. The three

requantized samples are used to address a look-up table which outputs the granule code. The decoder can

establish that granule coding has been used by examining the bit-allocation data.

Figure 4.34: Codes having ranges smaller than a power of two are inefficient. Here three codes with a range of five

values which would ordinarily need 3 × 3 bits can be carried in a single eight-bit word.
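
The saving from granule coding can be checked with a short sketch. The text describes a look-up table addressed by the three requantized samples; the base-n arithmetic below is one possible mapping that such a table could hold, and the ordering of the three samples within the code is an assumption made for illustration.

```python
def separate_bits(levels):
    """Bits needed when each of the three samples is coded on its own."""
    return 3 * (levels - 1).bit_length()


def granule_bits(levels):
    """Bits needed for one code covering every combination of three samples."""
    return (levels ** 3 - 1).bit_length()


def pack_granule(samples, levels):
    """Combine three requantized samples (each 0..levels-1) into one code."""
    s0, s1, s2 = samples
    if not all(0 <= s < levels for s in samples):
        raise ValueError("sample outside the quantizing range")
    return s0 + levels * s1 + levels * levels * s2


def unpack_granule(code, levels):
    """Recover the three samples from a granule code."""
    code, s0 = divmod(code, levels)
    s2, s1 = divmod(code, levels)
    return (s0, s1, s2)


# Bit cost per set of three samples: (coded separately, coded as a granule).
costs = {n: (separate_bits(n), granule_bits(n)) for n in (3, 5, 9)}
```

For five levels this reproduces the figures in the text: nine bits when the samples are coded separately and seven bits as a granule. The same calculation gives six against five bits for three levels, and twelve against ten bits for nine levels.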

The requantized samples/granules in each sub-band, bit allocation data, scale factors and scale factor select codes

are multiplexed into the output bit stream.

The Layer II decoder is shown in Figure 4.35. This is not much more complex than the Layer I decoder. The

demultiplexing will separate the sample data from the side information. The bit-allocation data will specify the

wordlength or granule size used so that the sample block can be deserialized and the granules decoded. The scale

factor select information will be used to decode the compressed scale factors to produce one scale factor per block

of 384 samples. Inverse quantizing and inverse sub-band filtering take place as for Layer I.

Figure 4.35: A Layer II decoder is slightly more complex than the Layer I decoder because of the need to decode

granules and scale factors.
