Figure 4.21: The noise-to-masking ratio is derived as shown here.
Practical systems should have a finite NMR in order to give a degree of protection against difficult signals which
have not been anticipated and against the use of post-codec equalization or several tandem codecs which could
change the masking threshold. There is a strong argument that devices used for audio production should have a
greater NMR than consumer or program delivery devices.
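The idea of a protective margin can be illustrated numerically. The following is a minimal sketch, assuming that a masking threshold and a coding noise level are already available in dB for each sub-band. The function name `nmr_margin_db`, the band values, and the sign convention (threshold minus noise, so that a positive figure means the noise is masked with headroom to spare) are assumptions made for illustration and are not taken from any particular codec.

```python
def nmr_margin_db(mask_db, noise_db):
    """Per-band margin in dB between masking threshold and coding noise.

    Assumed convention: margin = threshold - noise, so a positive value
    means the noise sits below the threshold and a negative value means
    the noise is expected to be audible.
    """
    if len(mask_db) != len(noise_db):
        raise ValueError("band lists must have the same length")
    return [m - n for m, n in zip(mask_db, noise_db)]


# Hypothetical levels for four sub-bands, in dB.
mask_db = [40.0, 35.0, 30.0, 25.0]
noise_db = [34.0, 29.0, 30.0, 27.0]

margins = nmr_margin_db(mask_db, noise_db)
worst = min(margins)  # the band with the least protection
audible_bands = [i for i, m in enumerate(margins) if m < 0]
```

With these made-up figures the first two bands keep 6 dB in hand, the third sits exactly at the threshold, and the fourth exceeds it. A production device would aim to keep the worst-case figure comfortably positive so that later equalization or tandem coding does not expose the noise.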
[9] Grewin, C. and Ryden, T., Subjective assessments on low bit-rate audio codecs. Proc. 10th Int. Audio Eng. Soc.
Conf., 91-102, New York: Audio Eng. Soc. (1991)
[10] Gilchrist, N.H.C., Digital sound: the selection of critical programme material and preparation of the recordings for
CCIR tests on low bit rate codecs. BBC Res. Dept. Rep. RD 1993/1
[11] Colomes, C. and Faucon, G., A perceptual objective measurement system (POM) for the quality assessment of
perceptual codecs. Presented at 96th Audio Eng. Soc. Conv., Amsterdam (1994), Preprint No. 3801 (P4.2)
4.12 The limits
There are, of course, limits to all technologies. As the amount of compression is increased, artifacts will eventually
be heard which no amount of detailed modelling will remove. The ear is only able to perceive a certain proportion
of the information in a given sound. This could be called the perceptual entropy [12], and all additional sound is
redundant or irrelevant. Data reduction works by removing the redundancy, and clearly an ideal system would
remove all of it, leaving only the entropy. Once this has been done, the masking capacity of the ear has been
reached and the NMR has reached zero over the whole band. Reducing the data rate further must reduce the
entropy, because raising noise further at any frequency will render it audible. In practice the audio bandwidth will
have to be reduced in order to keep the noise level acceptable. In MPEG-1, pre-filtering allows data from higher
sub-bands to be neglected. MPEG-2 has introduced some low sampling rate options for this purpose. Thus there is
a limit to the degree of data reduction which can be achieved even with an ideal coder. Systems which go beyond
that limit are not appropriate for high-quality music, but are relevant in news gathering and communications where
intelligibility of speech is the criterion. Interestingly, the data rate out of a coder is virtually independent of the input
sampling rate unless the sampling rate is very low. This is because the entropy of the sound is in the waveform, not
in the number of samples carrying it.
The compression factor of a coder is only part of the story. All codecs cause delay, and in general the greater the
compression the longer the delay. In some applications, such as telephony, a short delay is required [13]. In many
applications, the compressed channel will have a constant bit rate, and so a constant compression factor is
required. In real program material, the entropy varies and so the NMR will fluctuate. If greater delay can be
accepted, as in a recording application, memory buffering can be used to allow the coder to operate at constant
NMR and instantaneously variable data rate. The memory absorbs the instantaneous data rate differences of the
coder and allows a constant rate in the channel. A higher effective compression factor will then be obtained. Near-
constant quality can also be achieved using statistical multiplexing.
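The buffering arrangement described above can be sketched as a simple occupancy simulation. This is an illustrative model only: the function name `simulate_buffer`, the frame sizes, and the channel rate are hypothetical, and the sketch assumes that the coder emits a whole frame at once and that the channel removes a fixed number of bits per frame period. A real coder would also need rate control to prevent the buffer overflowing or running empty, which is not modelled here.

```python
def simulate_buffer(frame_bits, channel_bits_per_frame):
    """Track buffer occupancy when a variable-rate coder feeds a constant-rate channel.

    frame_bits: bits produced by the coder for each frame (variable).
    channel_bits_per_frame: bits the channel removes per frame period (fixed).
    Returns (peak occupancy in bits, occupancy after each frame).
    """
    occupancy = 0
    history = []
    for bits in frame_bits:
        occupancy += bits  # coder writes the frame into the buffer
        # The channel drains at its fixed rate, but cannot take more than is stored.
        occupancy -= min(occupancy, channel_bits_per_frame)
        history.append(occupancy)
    return max(history, default=0), history


# Hypothetical coder output whose average equals the channel rate (1600 bits per frame).
frame_bits = [1000, 3000, 500, 2500, 1000]
peak, history = simulate_buffer(frame_bits, 1600)

# Extra delay contributed by the buffer, expressed in frame periods.
added_delay_frames = peak / 1600
```

The peak occupancy indicates how much memory is needed, and dividing it by the channel rate gives the extra delay the buffer introduces, which is why this approach suits recording better than telephony.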
[12] Johnston, J., Estimation of perceptual entropy using noise masking criteria. ICASSP, 2524-2527 (1988)
[13] Gilchrist, N.H.C., Delay in broadcasting operations. Presented at 90th Audio Eng. Soc. Conv. (1991), Preprint
No. 3033
4.13 Compression applications
One of the fundamental concepts of PCM audio is that the signal-to-noise ratio of the channel can be determined
by selecting a suitable wordlength. In conjunction with the sampling rate, the resulting data rate is then determined.
In many cases the full data rate can be transmitted or recorded. A high-quality digital audio channel requires
around one megabit per second, whereas a standard definition component digital video channel needs two
hundred times as much. With mild video compression the video bit rate is still so much in excess of the audio rate
 