Audio compression - The MPEG

The main profile requires the most complex encoder, which makes use of all the coding tools. The low-complexity (LC) profile omits certain tools and restricts the power of others to reduce processing and memory requirements. The remaining tools in LC profile coding are identical to those in main profile, so a main profile decoder can decode LC profile bitstreams. The scalable sampling rate (SSR) profile splits the input audio into four equal frequency bands, each of which results in a self-contained bitstream. A simple decoder can decode only one, two or three of these bitstreams to produce a reduced-bandwidth output. Not all the AAC tools are available to the SSR profile.
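
The reduced-bandwidth behaviour of the SSR profile can be illustrated with a little arithmetic. The sketch below assumes that the four equal bands together span 0 Hz to half the sampling rate and that a simple decoder keeps the lowest bands first; the function name and its parameters are hypothetical, chosen only for this illustration.

```python
def ssr_output_bandwidth_hz(sample_rate_hz, bands_decoded, total_bands=4):
    """Audio bandwidth obtained when only some of the equal bands are decoded.

    Assumption for this sketch: the bands tile 0..sample_rate/2 evenly and
    the decoder keeps the lowest `bands_decoded` of them.
    """
    if not 1 <= bands_decoded <= total_bands:
        raise ValueError("bands_decoded must be between 1 and total_bands")
    nyquist_hz = sample_rate_hz / 2
    return nyquist_hz * bands_decoded / total_bands


# Bandwidth for one, two, three and all four bands at 48 kHz sampling.
for bands in range(1, 5):
    print(bands, ssr_output_bandwidth_hz(48000, bands))
```

At 48 kHz each band is 6 kHz wide under these assumptions, so decoding one, two or three bands gives 6, 12 or 18 kHz of audio bandwidth instead of the full 24 kHz.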

The increased complexity of AAC allows the introduction of lossless coding tools. These allow a lower bit rate for the same quality, or improved quality at a given bit rate, because the reliance on lossy coding is reduced. Greater attention is also given to the interplay between time-domain and frequency-domain precision in the human hearing system.

Figure 4.40 shows a block diagram of an AAC main profile encoder. The audio signal path is straight through the centre. The formatter assembles any side-chain data along with the coded audio data to produce a compliant bitstream. The input signal passes to the filter bank and the perceptual model in parallel. The filter bank consists of a 50 per cent overlapped, critically sampled MDCT which can be switched between block lengths of 2048 and 256 samples. At 48 kHz the filter bank gives resolutions of about 23 Hz and 21 ms with long blocks, or about 187 Hz and 2.7 ms with short blocks. As AAC is a multichannel coding system, block length switching cannot be done indiscriminately, as this would result in loss of block phase between channels. Consequently, if short blocks are selected, the coder will remain in short block mode for integer multiples of eight blocks. This is shown in Figure 4.41, which also shows the use of transition windows between the block sizes, as was done in Layer III.
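
The resolution figures quoted above follow from the block length and the sampling rate. The sketch below assumes that a 50 per cent overlapped, critically sampled transform of N input samples yields N/2 coefficients spread evenly from 0 Hz to half the sampling rate, and advances by N/2 samples per block; `mdct_resolution` is a hypothetical helper name.

```python
def mdct_resolution(block_length, sample_rate_hz):
    """Return (frequency spacing in Hz, time step in ms) for one block length."""
    hop = block_length // 2        # 50 % overlap: each block advances by N/2 samples
    n_coeffs = block_length // 2   # critical sampling: N/2 coefficients per block
    nyquist_hz = sample_rate_hz / 2
    freq_hz = nyquist_hz / n_coeffs
    time_ms = 1000.0 * hop / sample_rate_hz
    return freq_hz, time_ms


for n in (2048, 256):
    f, t = mdct_resolution(n, 48000)
    print(f"block {n}: {f:.1f} Hz, {t:.2f} ms")
```

Under these assumptions the long block gives roughly 23.4 Hz and 21.3 ms, and the short block 187.5 Hz and 2.67 ms, which matches the rounded values in the text and shows the trade between frequency and time resolution.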

Figure 4.40: The AAC encoder. Signal flow is from left to right whereas side-chain data flow is vertical.

Figure 4.41: In AAC short blocks must be used in multiples of 8 so that the long block phase is undisturbed. This keeps block synchronism in multichannel systems.
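
The multiple-of-eight rule can be checked numerically: eight short blocks advance by exactly the same number of samples as one long block, so the long-block grid is never shifted. The sketch below also plans a window sequence from per-frame transient flags. The planner, its labels (`LONG`, `START`, `8xSHORT`, `STOP`) and its handling of a long frame caught between two short groups are illustrative assumptions, not the behaviour of any particular encoder.

```python
LONG_BLOCK = 2048
SHORT_BLOCK = 256
SHORTS_PER_GROUP = 8


def hop(block_length):
    """Samples by which a 50 % overlapped block advances."""
    return block_length // 2


def plan_windows(transient_flags):
    """Hypothetical planner: one label per long-block frame period."""
    n = len(transient_flags)
    use_short = list(transient_flags)
    # A long frame squeezed between two short groups would have to act as both
    # a stop and a start window; this sketch simply codes it with short blocks.
    for i in range(1, n - 1):
        if use_short[i - 1] and use_short[i + 1]:
            use_short[i] = True
    plan = []
    for i in range(n):
        if use_short[i]:
            plan.append("8xSHORT")   # a full group of eight short blocks
        elif i + 1 < n and use_short[i + 1]:
            plan.append("START")     # transition window, long to short
        elif i > 0 and use_short[i - 1]:
            plan.append("STOP")      # transition window, short to long
        else:
            plan.append("LONG")
    return plan


# Eight short hops cover the same span as one long hop (1024 samples).
print(SHORTS_PER_GROUP * hop(SHORT_BLOCK), hop(LONG_BLOCK))
print(plan_windows([False, False, True, False, False]))
```

Because every frame period is filled either by one long block or by a whole group of eight short blocks, all channels stay on the same frame boundaries whichever block size each one selects.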

The shape of the window function affects the frequency selectivity of the MDCT. In AAC it is possible to select either a sine window or a Kaiser-Bessel-derived (KBD) window, depending on the input audio spectrum. As was seen in Chapter 3, filter windows allow different compromises between bandwidth and rate of roll-off. The KBD window rolls off later but is steeper, and thus gives better rejection of frequencies more than about 200 Hz apart, whereas the sine window rolls off earlier but less steeply, and so gives better rejection of frequencies less than 70 Hz apart.
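
The two window shapes named above can be generated with the standard library alone. The sketch below uses the usual textbook constructions: a half-sample-offset sine, and a KBD window formed from the square root of the running sum of a Kaiser kernel. The shape parameter `alpha=4.0` and the block length of 256 are assumptions for the example, and `bessel_i0` is a plain series evaluation written for this sketch. As background not stated in the text, both windows satisfy the power-complementary condition w[n]² + w[n + N/2]² = 1, which a 50 per cent overlapped MDCT needs for perfect reconstruction; the last function checks it.

```python
import math


def bessel_i0(x):
    """Zeroth-order modified Bessel function via its power series."""
    term, total, k = 1.0, 1.0, 1
    while term > 1e-15 * total:
        term *= (x / (2.0 * k)) ** 2
        total += term
        k += 1
    return total


def sine_window(n_len):
    return [math.sin(math.pi * (n + 0.5) / n_len) for n in range(n_len)]


def kbd_window(n_len, alpha):
    """Kaiser-Bessel-derived window; alpha is an assumed shape parameter."""
    half = n_len // 2
    kaiser = []
    for j in range(half + 1):
        r = 2.0 * j / half - 1.0
        kaiser.append(bessel_i0(math.pi * alpha * math.sqrt(max(0.0, 1.0 - r * r))))
    total = sum(kaiser)
    rising, running = [], 0.0
    for j in range(half):
        running += kaiser[j]
        rising.append(math.sqrt(running / total))
    return rising + rising[::-1]   # second half mirrors the first


def max_complementarity_error(window):
    """Largest deviation from w[n]^2 + w[n + N/2]^2 = 1."""
    half = len(window) // 2
    return max(abs(window[n] ** 2 + window[n + half] ** 2 - 1.0) for n in range(half))


for name, w in (("sine", sine_window(256)), ("KBD", kbd_window(256, 4.0))):
    print(name, max_complementarity_error(w))
```

Since both shapes meet the same overlap condition, an encoder is free to choose between them purely on the grounds of frequency selectivity.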

Following the filter bank is the intra-block predictive coding module. When enabled, this module finds redundancy between the coefficients within one transform block. In Chapter 3 the concept of transform duality was introduced, in which a certain characteristic in the frequency domain would be accompanied by a dual characteristic in the time domain and vice versa. Figure 4.42 shows that in the time domain, predictive coding works well on stationary signals but fails on transients. The dual of this characteristic is that in the frequency domain, predictive coding works well on transients but fails on stationary signals.
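
This duality can be demonstrated on toy signals. The sketch below fits a two-tap least-squares linear predictor to a sequence and reports the prediction gain, first along time and then along the coefficients of a transform block. Several things here are assumptions made for illustration: a plain DCT-II stands in for the MDCT, the tone is placed exactly on a transform bin, the transient is a single-sample click, and the residual energy is floored at 1e-12 to keep the logarithm finite. None of this is the AAC module itself.

```python
import math

N = 64  # toy block length


def dct2(x):
    """Unnormalised DCT-II, used here as a stand-in for the MDCT."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def prediction_gain_db(seq):
    """Gain of a 2-tap least-squares predictor: signal energy over residual energy."""
    y, p1, p2 = seq[2:], seq[1:-1], seq[:-2]
    a11 = sum(v * v for v in p1)
    a22 = sum(v * v for v in p2)
    a12 = sum(u * v for u, v in zip(p1, p2))
    b1 = sum(u * v for u, v in zip(y, p1))
    b2 = sum(u * v for u, v in zip(y, p2))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        c1 = c2 = 0.0
    else:
        c1 = (b1 * a22 - b2 * a12) / det
        c2 = (a11 * b2 - a12 * b1) / det
    residual = sum((t - c1 * u - c2 * v) ** 2 for t, u, v in zip(y, p1, p2))
    energy = sum(t * t for t in y)
    return 10.0 * math.log10(energy / max(residual, 1e-12))


tone = [math.cos(math.pi * (n + 0.5) * 10 / N) for n in range(N)]   # stationary
click = [1.0 if n == 20 else 0.0 for n in range(N)]                 # transient

print("time domain, tone :", prediction_gain_db(tone))
print("time domain, click:", prediction_gain_db(click))
print("freq domain, tone :", prediction_gain_db(dct2(tone)))
print("freq domain, click:", prediction_gain_db(dct2(click)))
```

In the time domain the tone is almost perfectly predictable and the click is not; across the transform coefficients the roles reverse, because the click becomes a smooth oscillation over frequency while the tone collapses to a single isolated coefficient.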
