tones at those frequencies can be made coarser. Fewer bits are needed and a coding gain results. The increased
quantizing distortion is allowable if it is masked by the presence of the first tone.
The functioning of the ear is noticeably level-dependent and perceptual coders take this into account. However, all
signal processing takes place in the electrical or digital domain with respect to electrical or numerical levels
whereas the hearing mechanism operates with respect to true sound pressure level. Figure 4.20 shows that in an
ideal system the overall gain of the microphones and ADCs is such that the PCM codes have a relationship with
sound pressure which is the same as that assumed by the model in the codec. Equally the overall gain of the DAC
and loudspeaker system should be such that the sound pressure levels which the codec assumes are those
actually heard. Clearly, the gain control of the microphone and the volume control of the reproduction system must
be calibrated if the hearing model is to function properly. If, for example, the microphone gain were too low and this
were compensated by advancing the loudspeaker gain, the overall gain would be the same, but the codec would be
fooled into thinking that the sound pressure level was lower than it really was, and the masking model would then
be inappropriate.
Figure 4.20: Audio coders must be level calibrated so that the psychoacoustic decisions in the coder are based on
correct sound pressure levels.
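The microphone-gain example above can be put into numbers. The following is a minimal sketch, not part of the original text: the 80 dB source level and the 10 dB gain errors are hypothetical values, and the function names are illustrative only.

```python
def level_assumed_by_codec(true_spl_db, input_gain_error_db):
    """SPL the codec's hearing model infers when the microphone/ADC gain is off."""
    return true_spl_db + input_gain_error_db


def level_heard(true_spl_db, input_gain_error_db, output_gain_error_db):
    """SPL actually reproduced after the DAC/loudspeaker gain is applied."""
    return true_spl_db + input_gain_error_db + output_gain_error_db


true_spl = 80.0        # hypothetical sound pressure level at the microphone, dB
mic_error = -10.0      # microphone gain set 10 dB too low (hypothetical)
speaker_error = 10.0   # loudspeaker gain advanced 10 dB to compensate

codec_view = level_assumed_by_codec(true_spl, mic_error)
heard = level_heard(true_spl, mic_error, speaker_error)
model_error = codec_view - true_spl

print("codec assumes", codec_view, "dB; listener hears", heard, "dB")
print("masking model is evaluated", model_error, "dB away from the true level")
```

The end-to-end gain is correct, so the listener hears the original level, yet the masking model inside the codec is working from a level 10 dB too low.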
The above should come as no surprise as analog audio codecs such as the various Dolby systems have required
and implemented line-up procedures and suitable tones. However obvious the need to calibrate coders may be, the
degree to which this is recognized in the industry is almost negligible to date and this can only result in sub-optimal
performance.
4.11 Quality measurement
As has been seen, one way in which coding gain is obtained is to requantize sample values to reduce the
wordlength. Since the resultant requantizing error is a distortion mechanism it results in energy moving from one
frequency to another. The masking model is essential to estimate how audible the effect of this will be. The greater
the degree of compression required, the more precise the model must be. If the masking model is inaccurate, then
equipment based upon it may produce audible artifacts under some circumstances. Artifacts may also result if the
model is not properly implemented. As a result, development of audio compression units requires careful listening
tests with a wide range of source material [9] [10] and precision loudspeakers. The presence of artifacts at a given
compression factor indicates only that performance is below expectations; it does not distinguish between the
implementation and the model. If the implementation is verified, then a more detailed model must be sought.
Naturally comparative listening tests are only valid if all the codecs have been level calibrated and if the
loudspeakers cause less loss of information than any of the codecs, a requirement which is frequently overlooked.
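The requantizing step mentioned at the start of this section can be illustrated with a short sketch. It is not taken from any particular codec: the 16-bit to 8-bit reduction, the 1 kHz test tone and the function name requantize are assumptions made for illustration, and no dither is applied.

```python
import math


def requantize(samples, in_bits, out_bits):
    """Shorten the wordlength by rounding to the nearest coarser step (no dither)."""
    shift = in_bits - out_bits
    step = 1 << shift                      # size of the new quantizing step
    lo = -(1 << (in_bits - 1))             # most negative code
    hi = (1 << (in_bits - 1)) - step       # largest code that is a multiple of step
    return [max(lo, min(hi, ((s + step // 2) >> shift) << shift)) for s in samples]


FS = 48000  # assumed sample rate, Hz
original = [round(16384 * math.sin(2 * math.pi * 1000 * n / FS)) for n in range(480)]
coarse = requantize(original, 16, 8)

# The requantizing error is the difference between the two versions.
error = [o - c for o, c in zip(original, coarse)]
rms_error = math.sqrt(sum(e * e for e in error) / len(error))
print("step:", 1 << 8, "peak error:", max(abs(e) for e in error), "rms error:", round(rms_error, 1))
```

Because the tone repeats exactly every 48 samples, the error repeats with it. The error is therefore a distortion of the signal, with its energy at frequencies other than the tone itself, rather than steady noise. This is why its audibility has to be judged with a masking model.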
Properly conducted listening tests are expensive and time-consuming, and alternative methods have been
developed which can be used to evaluate the performance of different techniques objectively. The
noise-to-masking ratio (NMR) is one such measurement. [11] Figure 4.21 shows how NMR is measured. Input audio signals
are fed simultaneously to a data reduction coder and decoder in tandem and to a compensating delay whose
length must be adjusted to match the codec delay. At the output of the delay, the coding error is obtained by
subtracting the codec output from the original. The original signal is spectrum-analysed into critical bands in order
to derive the masking threshold of the input audio, and this is compared with the critical band spectrum of the error.
The NMR in each critical band is the ratio of the quantizing error due to the codec to the masking threshold,
expressed in decibels. An average NMR for all bands can be computed. A positive NMR in any band indicates that the
error exceeds the masking threshold in that band and that artifacts are
potentially audible. Plotting the average NMR against time is a powerful technique, as with an ideal codec the NMR
should be stable with different types of program material. If this is not the case the codec could perform quite
differently as a function of the source material. NMR excursions can be correlated with the waveform of the audio
input to analyse how the extra noise was caused and to redesign the codec to eliminate it.
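The measurement chain described above (codec, compensating delay, subtraction, band analysis, comparison with the masking threshold) can be sketched as follows. This is a rough illustration under loudly stated assumptions, not the published NMR procedure: the "codec" is only a requantizer with a fixed delay, the seven bands are hypothetical stand-ins for a real critical-band table, and the masking threshold is crudely assumed to sit a fixed 20 dB below the signal energy in each band, with an arbitrary floor.

```python
import cmath
import math

FS = 48000                  # sample rate in Hz (assumed)
# Hypothetical coarse bands standing in for a real critical-band table.
BAND_EDGES_HZ = [0, 500, 1000, 2000, 4000, 8000, 16000, 24000]
MASK_OFFSET_DB = 20.0       # assumed: threshold is 20 dB below the band energy
THRESHOLD_FLOOR = 1e-9      # assumed stand-in for the threshold in quiet
CODEC_DELAY = 16            # delay of the stand-in codec, in samples


def stand_in_codec(x, bits):
    """Not a real coder: requantizes to `bits` and delays by CODEC_DELAY samples."""
    step = 2.0 / (1 << bits)
    return [0.0] * CODEC_DELAY + [round(s / step) * step for s in x]


def band_energies(x):
    """Energy per band from a naive DFT (bins below the Nyquist frequency only)."""
    n = len(x)
    energies = [0.0] * (len(BAND_EDGES_HZ) - 1)
    for k in range(n // 2):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freq = k * FS / n
        for b in range(len(energies)):
            if BAND_EDGES_HZ[b] <= freq < BAND_EDGES_HZ[b + 1]:
                energies[b] += abs(coeff) ** 2 / n
    return energies


def band_nmr_db(error_energy, threshold_energy):
    """Noise-to-masking ratio in dB; positive means the error exceeds the threshold."""
    return 10.0 * math.log10(max(error_energy, 1e-30) / threshold_energy)


def measure_nmr(x, bits):
    decoded = stand_in_codec(x, bits)
    delayed = [0.0] * CODEC_DELAY + list(x)        # compensating delay
    error = [d - c for d, c in zip(delayed, decoded)]
    scale = 10.0 ** (-MASK_OFFSET_DB / 10.0)
    thresholds = [max(e * scale, THRESHOLD_FLOOR) for e in band_energies(delayed)]
    return [band_nmr_db(e, t) for e, t in zip(band_energies(error), thresholds)]


tones = [440.0, 1000.0, 3000.0]                    # hypothetical test signal
signal = [sum(0.3 * math.sin(2 * math.pi * f * n / FS) for f in tones) for n in range(256)]
for bits in (4, 12):
    nmr = measure_nmr(signal, bits)
    print(bits, "bits: mean NMR %.1f dB, worst band %.1f dB" % (sum(nmr) / len(nmr), max(nmr)))
```

Calling measure_nmr on successive blocks of programme material and plotting the mean against time would give the kind of NMR-versus-time trace described above. A practical measurement would use a proper critical-band analysis and a masking model with spreading between bands in place of the fixed offset assumed here.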
 