[ 17 ] Princen, J.P., Johnson, A. and Bradley, A.B., Sub-band/transform coding using filter bank designs based on time
domain aliasing cancellation. Proc. ICASSP, 2161-2164 (1987)
4.15 Sub-band coding
Sub-band coding takes advantage of the fact that real sounds do not have uniform spectral energy. The wordlength
of PCM audio is based on the dynamic range required and this is generally constant with frequency although any
pre-emphasis will affect the situation. When a signal with an uneven spectrum is conveyed by PCM, the whole
dynamic range is occupied only by the loudest spectral component, and all of the other components are coded with
excessive headroom. In its simplest form, sub-band coding works by splitting the audio signal into a number of
frequency bands and companding each band according to its own level. Bands in which there is little energy result
in small amplitudes which can be transmitted with short wordlength. Thus each band results in variable-length
samples, but the sum of all the sample wordlengths is less than that of PCM and so a coding gain can be obtained.
Sub-band coding is not restricted to the digital domain; the analog Dolby noise-reduction systems use it
extensively.
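The wordlength saving described above can be illustrated with a little arithmetic. The following is a minimal sketch, not a real coder: it assumes 16-bit PCM, four sub-bands each decimated by four (so one sample per band replaces four PCM samples), and made-up peak amplitudes per band. The function names are hypothetical, and the side information needed to tell the decoder each band's wordlength is ignored.

```python
PCM_BITS = 16  # assumed PCM wordlength


def band_wordlengths(band_peaks):
    """Two's complement wordlength needed to carry each band's peak magnitude."""
    # bit_length() gives the magnitude bits; one more bit carries the sign.
    return [int(p).bit_length() + 1 if p > 0 else 0 for p in band_peaks]


def coding_gain(band_peaks, pcm_bits=PCM_BITS):
    """Ratio of PCM bits to sub-band bits for one block of len(band_peaks) samples."""
    subband_bits = sum(band_wordlengths(band_peaks))
    pcm_total = pcm_bits * len(band_peaks)
    return pcm_total / subband_bits


# Hypothetical peaks: most energy in the lowest band, very little at the top.
peaks = [20000, 3000, 200, 12]
print(band_wordlengths(peaks))       # wordlength per band
print(round(coding_gain(peaks), 2))  # PCM bits divided by sub-band bits
```

With these illustrative peaks only the lowest band needs the full 16 bits, so the four band samples together occupy 43 bits where PCM would use 64.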
The number of sub-bands to be used depends upon what other compression tools are to be combined with the
sub-band coding. If it is intended to optimize compression based on auditory masking, the sub-bands should
preferably be narrower than the critical bands of the ear, and therefore a large number will be required. This
requirement is frequently not met: ISO/MPEG Layers I and II use only 32 sub-bands. Figure 4.27 shows the critical
condition where the masking tone is at the top edge of the sub-band. It will be seen that the narrower the sub-band,
the higher the requantizing 'noise' that can be masked. The use of an excessive number of sub-bands will,
however, raise complexity and the coding delay, as well as risking pre-ringing on transients which may exceed the
temporal masking.
Figure 4.27: In sub-band coding the worst case occurs when the masking tone is at the top edge of the sub-band.
The narrower the band, the higher the noise level which can be masked.
The bandsplitting process is complex and computationally intensive. One useful bandsplitting method is
quadrature mirror filtering, described in Chapter 3.
4.17 MPEG audio compression
The subject of audio compression was well advanced when the MPEG/Audio group was formed. As a result it was
not necessary for the group to produce ab initio codecs because existing work was considered suitable.
As part of the Eureka 147 project, a system known as MUSICAM [ 19 ] (Masking pattern adapted Universal Sub-band
Integrated Coding And Multiplexing) was developed jointly by CCETT in France, IRT in Germany and Philips in the
Netherlands. MUSICAM was designed to be suitable for DAB (digital audio broadcasting). As a parallel
development, the ASPEC [ 20 ] (Adaptive Spectral Perceptual Entropy Coding) system was developed from a number
of earlier systems as a joint proposal by AT&T Bell Labs, Thomson, the Fraunhofer Society and CNET. ASPEC
was designed for use at high compression factors to allow audio transmission on ISDN.
These two systems were both fully implemented by July 1990 when comprehensive subjective testing took place at
the Swedish Broadcasting Corporation. [ 21 ] [ 22 ] As a result of these tests, the MPEG/Audio group combined the
attributes of both ASPEC and MUSICAM into a standard having three levels of complexity and
performance.
[1] ISO/IEC JTC1/SC29/WG11 MPEG, International standard ISO 11172-3 Coding of moving pictures and
associated audio for digital storage media up to 1.5 Mbits/s, Part 3: Audio (1992)
[2] Brandenburg, K. and Stoll, G., ISO-MPEG-1 Audio: A generic standard for coding of high quality audio. JAES,
42, 780-792 (1994)
These three different levels, which are known as layers , are needed because of the number of possible
applications. Audio coders can be operated at various compression factors with different quality expectations.
Stereophonic classical music requires different quality criteria from monophonic speech. The complexity of the
 