Chapter 4: Audio compression
4.1 Introduction
Physics can tell us the mechanism by which disturbances propagate through the air. If this is our definition of
sound, we have the problem that in physics there are no limits to the frequencies and levels which must be
considered. Biology can tell us that the ear only responds to a certain range of frequencies provided a threshold level
is exceeded. This is a better definition of sound; reproduction is easier because it is only necessary to reproduce
that range of levels and frequencies which the ear can detect.
Psychoacoustics can describe how our hearing has finite resolution in time, frequency and spatial domains such
that what we perceive is an inexact impression. Some aspects of the original disturbance are inaudible to us and
are said to be masked. If our goal is the highest quality, we can design our imperfect equipment so that the
shortcomings are masked. Conversely if our goal is economy we can use compression and hope that masking will
disguise the inaccuracies it causes.
By definition, the sound quality of a perceptive coder can only be assessed by human hearing. Equally, a useful
perceptive coder can only be designed with a good knowledge of the human hearing mechanism. [ 1 ] The acuity of
the human ear is astonishing. The frequency range is extremely wide, covering some ten octaves (an octave is a
doubling of pitch or frequency) without interruption. It can detect tiny amounts of distortion, and will accept an
enormous dynamic range. If the ear detects a different degree of impairment between two codecs having the same
bit rate in properly conducted tests, we can say that one of them is superior.
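The relationship between octaves and frequency ratio can be checked with a short calculation. The sketch below is illustrative only: the 20 Hz and 20 kHz limits are commonly quoted nominal figures assumed here for the example and are not taken from the text, and the function name is hypothetical.

```python
import math


def octaves_between(f_low_hz: float, f_high_hz: float) -> float:
    """Number of octaves spanned by two frequencies.

    One octave is a doubling of frequency, so the span is log2 of the ratio.
    """
    return math.log2(f_high_hz / f_low_hz)


# Assumed nominal hearing limits (not stated in the text).
ASSUMED_LOW_HZ = 20.0
ASSUMED_HIGH_HZ = 20_000.0

span = octaves_between(ASSUMED_LOW_HZ, ASSUMED_HIGH_HZ)
print(f"{span:.2f} octaves")  # 9.97 octaves

# Conversely, exactly ten octaves correspond to a frequency ratio of 2**10.
ten_octave_ratio = 2 ** 10
print(ten_octave_ratio)  # 1024
```

Under these assumed limits the span comes out just under ten octaves, consistent with the "some ten octaves" quoted above.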
Quality is completely subjective and can only be checked by listening tests, although these are meaningless if the
loudspeakers are not of sufficient quality. However, any characteristic of a signal which can be heard can also be
measured by a suitable instrument. The subjective tests can tell us how sensitive the instrument should be. Then
the objective readings from the instrument give an indication of how acceptable a signal is in respect of that
characteristic. Instruments and loudspeakers suitable for assessing the performance of codecs are currently
extremely rare and there remains much work to be done.
[ 1 ] Johnston, J.D., Transform coding of audio signals using perceptual noise criteria. IEEE J. Selected Areas in
Comms., JSAC-6, 314-323 (1988)
4.2 The deciBel
The first audio signals to be transmitted were on analog telephone lines. Where the wiring is long compared to the
electrical wavelength (not to be confused with the acoustic wavelength) of the signal, a transmission line exists in
which the distributed series inductance and the parallel capacitance interact to give the line a characteristic
impedance. In telephones this turned out to be about 600 Ω. In transmission lines the best power delivery occurs
when the source and the load impedance are the same; this is the process of matching. It was often required to
measure the power in a telephone system, and 1 milliWatt was chosen as a suitable unit. Thus the reference
against which signals could be compared was the dissipation of one milliWatt in 600 Ω. Figure 4.1 shows that the
dissipation of 1 mW in 600 Ω will be due to an applied voltage of 0.775 V rms. This voltage is the reference against
which all audio levels are compared.
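The 0.775 V figure follows directly from the power relationship P = V²/R, rearranged as V = √(P × R). A minimal sketch of that arithmetic is given below; the function and constant names are illustrative choices, not part of the original text.

```python
import math


def rms_voltage(power_watts: float, resistance_ohms: float) -> float:
    """RMS voltage that dissipates power_watts in resistance_ohms.

    Derived from P = V^2 / R, so V = sqrt(P * R).
    """
    return math.sqrt(power_watts * resistance_ohms)


REFERENCE_POWER_W = 1e-3      # 1 milliWatt, the telephone reference power
LINE_IMPEDANCE_OHMS = 600.0   # characteristic impedance quoted in the text

reference_voltage = rms_voltage(REFERENCE_POWER_W, LINE_IMPEDANCE_OHMS)
print(f"{reference_voltage:.3f} V rms")  # 0.775 V rms
```

Because power depends on the square of the voltage, doubling the applied voltage across the same 600 Ω load would quadruple the dissipated power.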
Figure 4.1: (a) Ohm's law: the power developed in a resistor is proportional to the square of the voltage.
 