present in a signal. There are three main types of redundancy in an audio signal that may be exploited.
Statistical redundancy In most audio signals, samples with lower magnitudes have a higher probability of occurrence than samples with higher magnitudes. In such cases, an entropy coding scheme, such as the Huffman code, can be used to allocate fewer bits to frequently occurring values and more bits to the remaining values. This reduces the bit rate for representing the audio signal compared with a scheme that allocates an equal number of bits to every sample.
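To make this concrete, the short sketch below (not from the text; the toy sample distribution and the heap-based Huffman construction are assumptions chosen for illustration) builds a Huffman code for quantized sample values in which small magnitudes dominate and compares its average code length with a fixed-length code.

import heapq
from collections import Counter
from math import ceil, log2

def huffman_lengths(freqs):
    # Return {symbol: code length} for a Huffman code built from symbol counts.
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)      # two least probable subtrees
        n2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Toy quantized samples: small magnitudes occur far more often than large ones.
samples = [0] * 50 + [1] * 25 + [-1] * 13 + [2] * 7 + [-2] * 3 + [3] * 2
freqs = Counter(samples)
lengths = huffman_lengths(freqs)
avg_bits = sum(freqs[s] * lengths[s] for s in freqs) / len(samples)
fixed_bits = ceil(log2(len(freqs)))          # bits/sample for an equal-length code
print(f"fixed-length: {fixed_bits} bits/sample, Huffman: {avg_bits:.2f} bits/sample")

For this toy distribution the Huffman code spends roughly 1.9 bits per sample against 3 bits for the fixed-length code, which is the bit-rate reduction described above.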
Temporal redundancy Neighboring audio samples are typically strongly correlated, so the value of a sample can be predicted with fairly high accuracy from the last few sample values. Predictive coding schemes exploit this temporal redundancy by subtracting the predicted value from the actual sample value. The resulting difference signal is then compressed using an entropy-based coding scheme, such as a dictionary or Huffman code.
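As a rough illustration (the synthetic sinusoid and the trivial predict-by-the-previous-sample rule are assumptions, not an example from the text), the sketch below shows that the difference signal occupies a much smaller range than the original samples, so the subsequent entropy coder has less to encode.

import numpy as np

fs = 8000
t = np.arange(200) / fs
samples = np.round(1000 * np.sin(2 * np.pi * 200 * t)).astype(int)

predicted = np.empty_like(samples)
predicted[0] = 0                 # no history available for the first sample
predicted[1:] = samples[:-1]     # predict each sample by its predecessor
residual = samples - predicted   # difference signal passed to the entropy coder

print("original range:", samples.min(), "to", samples.max())
print("residual range:", residual.min(), "to", residual.max())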
Psychoacoustic redundancy There are many idiosyncrasies in the human auditory system. For example, the human auditory system is most sensitive to frequencies in the 2000-4000 Hz band, and its sensitivity decreases above and below this band. In addition, a strong frequency component masks neighboring weaker frequency components. The unequal frequency sensitivity and masking properties are exploited to compress the audio.
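The toy sketch below (all numbers are assumptions chosen for illustration, not a real psychoacoustic model) hints at how masking can be exploited: frequency components lying well below a nearby strong component are treated as masked and discarded before coding.

import numpy as np

fs = 16000
t = np.arange(1024) / fs
# Strong 3 kHz tone, a weak neighbor at 3.1 kHz, and a weak distant tone at 7 kHz.
x = (np.sin(2 * np.pi * 3000 * t)
     + 0.02 * np.sin(2 * np.pi * 3100 * t)
     + 0.02 * np.sin(2 * np.pi * 7000 * t))

mag = np.abs(np.fft.rfft(x * np.hanning(len(x))))

kept = np.zeros(len(mag), dtype=bool)
for k in range(len(mag)):
    # Keep a bin only if no component within +/-10 bins is 30 dB stronger (toy rule).
    neighborhood = mag[max(0, k - 10):k + 11]
    kept[k] = mag[k] > neighborhood.max() / 10 ** (30 / 20)

for f in (3000, 3100, 7000):
    k = round(f * len(x) / fs)
    print(f"{f} Hz component kept:", bool(kept[k]))

The weak 3.1 kHz component sits next to the strong 3 kHz tone and is dropped, whereas the equally weak 7 kHz component has no strong neighbor and is kept.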
In the following section, we present a simplified audio compression technique, known as differential pulse-code modulation (DPCM). To achieve compression, DPCM reduces the temporal redundancy present in an audio signal.
17.4.1 Differential pulse-code modulation
Most audio signals encoded with pulse-code modulation (PCM) exhibit a strong
correlation between neighboring samples. This is especially true if the signal is
sampled above the Nyquist sampling rate. Figure 17.15 plots 30 samples of an
audio signal stored in the chord.wav file. We observe that the neighboring
samples are correlated such that their values are fairly close to each other. In DPCM, an audio sample s[k] is predicted from its past samples. An Mth-order predictor calculates the predicted value of the audio sample at time instant k using the following equation:
\hat{s}[k] = \sum_{m=1}^{M} \alpha_m \, s[k - m],    (17.15)

where s[k - m] is the value of the audio sample at time instant k - m and α_m
are the predictor coefficients. The DPCM encoder quantizes the prediction
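As a rough sketch of an encoder built around Eq. (17.15), the code below uses a second-order predictor (M = 2) together with a coarse uniform quantizer for the prediction error; the coefficient values, the step size, and the synthetic test signal are assumptions, not values taken from the text or from chord.wav.

import numpy as np

def dpcm_encode(samples, coeffs, step):
    # Encode samples with an Mth-order predictor as in Eq. (17.15).
    # coeffs: predictor coefficients alpha_1 .. alpha_M (assumed values).
    # step:   uniform quantizer step size for the prediction error.
    M = len(coeffs)
    history = [0.0] * M                                    # reconstructed s[k-1] .. s[k-M]
    codes = []
    for s in samples:
        # Prediction uses reconstructed samples so encoder and decoder stay in step.
        pred = sum(a * h for a, h in zip(coeffs, history)) # predicted value, Eq. (17.15)
        err = s - pred                                     # prediction error
        q = int(round(err / step))                         # quantized error index (transmitted)
        recon = pred + q * step                            # reconstruction the decoder would form
        history = [recon] + history[:-1]                   # shift the reconstructed past samples
        codes.append(q)
    return codes

# Toy usage on a slowly varying signal standing in for the chord.wav samples.
t = np.arange(200) / 8000.0
samples = 1000 * np.sin(2 * np.pi * 200 * t)
codes = dpcm_encode(samples, coeffs=[1.6, -0.8], step=8)   # assumed alpha_1, alpha_2 and step
print("max |sample| =", int(np.max(np.abs(samples))),
      " max |error index| =", max(abs(c) for c in codes))

Because the quantized error indices are much smaller than the original sample values, they can be represented with far fewer bits, which is the source of DPCM's compression gain.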