Digital Signal Processing Reference
give priority to some system parameters over others. In early speech coders,
which aimed at reproducing the input speech waveform as output, objective
measurement in the form of signal to quantization noise ratio was used.
Since the bit rate of early speech coders was 16 kb/s or greater (i.e. they
incurred only a small amount of quantization noise) and they did not involve
complicated signal processing algorithms which could change the shape of
the speech waveform, the SNR measures were reasonably accurate. However,
at lower bit rates, where the noise (the objective difference between the original
input and the synthetic output) increases, the signal to quantization
noise ratio may be misleading. Hence there is a need for a better objective
measurement which has a good correlation with the perceptual quality of the
synthetic speech. The ITU has standardized a number of such methods, the most
recent of which is P.862 (Perceptual Evaluation of Speech Quality, PESQ). In this
standard, various alignments and perceptual measures are used so that the
objective results match subjective MOS scores fairly accurately.
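As an illustration of the signal to quantization noise ratio discussed above, the following is a minimal sketch rather than any standardized procedure. It assumes the usual definition, 10 log10 of the ratio of signal energy to the energy of the difference between the input and the coded output. The function names snr_db and uniform_quantize, the test tone, and the simple uniform quantizer standing in for a waveform coder are all assumptions made for this example.

```python
import math


def snr_db(original, decoded):
    """Signal to quantization noise ratio in dB.

    Noise is taken as the sample-by-sample difference between the
    original input and the decoded output.
    """
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise_energy == 0:
        return math.inf  # identical signals: no quantization noise
    return 10.0 * math.log10(signal_energy / noise_energy)


def uniform_quantize(samples, bits, full_scale=1.0):
    """Hypothetical mid-rise uniform quantizer (a stand-in for a coder)."""
    levels = 2 ** bits
    step = 2.0 * full_scale / levels
    quantized = []
    for x in samples:
        index = math.floor(x / step)
        # Clamp to the available range of quantizer indices.
        index = max(-levels // 2, min(levels // 2 - 1, index))
        quantized.append((index + 0.5) * step)
    return quantized


# Assumed test input: one second of a 440 Hz tone sampled at 8 kHz.
tone = [0.9 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]

# Fewer bits per sample means more quantization noise and a lower SNR.
for bits in (8, 4, 2):
    print(bits, "bits:", round(snr_db(tone, uniform_quantize(tone, bits)), 1), "dB")
```

A measure of this kind only compares waveforms sample by sample, which is why it suits waveform approximating coders at high bit rates and says little about the perceived quality of coders that do not preserve the waveform.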
2.5 Summary
Existing speech coders can be divided into three groups: parametric coders,
waveform approximating coders, and hybrid coders. Parametric coders are
not expected to reproduce the original waveform; they reproduce the
perception of the original. Waveform approximating coders, on the other hand,
are expected to replicate the input speech waveform as the bit rate increases.
Hybrid coding is a combination of two or more coders of any type for the
best subjective (and perhaps objective) performance at a given bit rate.
The design process of a speech coder involves several trade-offs between
conflicting requirements. These requirements include the target bit rate,
quality, delay, complexity, channel error sensitivity, and sending of nonspeech
signals. Several standardization bodies have worked on speech coder
standards and, as a result, many standard speech coders have appeared
in the last decade. The bit rate of these coders ranges from
16 kb/s down to around 4 kb/s with target applications mainly in cellular
mobile radio. The selection of a speech coder involves expensive testing under
the expected typical operating conditions. The most popular testing method is
the subjective listening test. However, as this is expensive and time-consuming,
there has been some effort to produce simpler yet reliable objective measures.
ITU P.862 is the latest effort in this direction.