Digital Signal Processing Reference
4.5 Summary
A speech signal is highly correlated and exhibits both short- and long-term
similarities. These similarities, or redundancies, can be modelled by very
compact LPC and pitch filter formulations. The redundancies are usually
removed at the analysis stage so as to reduce the bit rate required for
transmitting the remaining residual signal. Obtaining the short- and
long-term filter coefficients requires a block of samples of reasonable
length, which introduces some delay into the analysis process. A typical
block length for good analysis performance is around 20-30 ms, which
corresponds to 160-240 samples at 8 kHz sampling. The assumption is that the
characteristics of the signal do not vary significantly within the block, so
that it can be analysed reasonably accurately. A 10th-order short-term LPC
filter updated every 20 ms and a first-order (single-tap) long-term pitch
filter updated every 5 ms give good performance.
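
The figures quoted above can be illustrated with a minimal sketch, which is
not a coder implementation. It assumes the autocorrelation method with the
Levinson-Durbin recursion for the short-term filter, and a single-tap pitch
predictor searched over one block of residual. All function names, the
predictor sign convention and the lag range are illustrative assumptions and
do not come from the text.

```python
def samples_per_block(duration_ms, sample_rate_hz=8000):
    """Number of samples in an analysis block of the given duration."""
    return int(round(duration_ms * sample_rate_hz / 1000))


def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n - k], for k = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Short-term predictor coefficients from autocorrelation values.

    Assumed convention: x_hat[n] = sum_{k=1..order} a[k] * x[n - k].
    Returns (a, err), where a[0] is unused and err is the final
    prediction-error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable block
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for stage i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lpc_residual(x, a):
    """Remove the short-term redundancy: e[n] = x[n] - x_hat[n].

    Samples before the start of the block are treated as zero.
    """
    order = len(a) - 1
    return [x[n] - sum(a[k] * x[n - k]
                       for k in range(1, min(order, n) + 1))
            for n in range(len(x))]


def single_tap_pitch(e, min_lag, max_lag):
    """Lag and gain of a single-tap long-term predictor.

    Picks the lag maximising the normalised cross-correlation, which
    minimises sum (e[n] - gain * e[n - lag])**2 over n >= max_lag.
    Ties are resolved in favour of the shortest lag.
    """
    best_lag, best_gain, best_score = min_lag, 0.0, 0.0
    for lag in range(min_lag, max_lag + 1):
        num = sum(e[n] * e[n - lag] for n in range(max_lag, len(e)))
        den = sum(e[n - lag] ** 2 for n in range(max_lag, len(e)))
        if den > 0.0 and num > 0.0:
            score = num * num / den
            if score > best_score:
                best_lag, best_gain, best_score = lag, num / den, score
    return best_lag, best_gain


# Block sizes quoted in the text, at 8 kHz sampling.
block_20ms = samples_per_block(20)   # 160 samples
block_30ms = samples_per_block(30)   # 240 samples
subframe_5ms = samples_per_block(5)  # 40 samples
```

For a block `x` of 160 samples, `levinson_durbin(autocorrelation(x, 10), 10)`
would give the 10th-order coefficients, and `lpc_residual` followed by
`single_tap_pitch` would give the long-term lag and gain. A practical coder
would normally window the block first and search the pitch lag over past
residual samples for each 5 ms subframe. Both steps are omitted here for
brevity.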