Speech Signal Analysis and Modelling - Digital Speech: Coding for Low Bit Rate Communication Systems


4.5 Summary

The speech signal is highly correlated and exhibits both short- and long-term similarities. These similarities, or redundancies, can easily be modelled by very compact LPC and pitch filter formulations. The redundancies are usually removed at the analysis stage so as to reduce the bit rate required for transmitting the remaining residual signal. Obtaining the short- and long-term filter coefficients requires a block of samples of reasonable length, which introduces some delay into the analysis process. A typical block length for good analysis performance is around 20-30 ms, which corresponds to 160-240 samples at 8 kHz sampling. The assumption is that the characteristics of the speech do not vary significantly within the block, and hence that the block can be analysed reasonably accurately. A 10th-order short-term LPC filter updated every 20 ms and a first-order (single-tap) long-term pitch filter updated every 5 ms give good performance.
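The figures quoted above can be made concrete with a small sketch. The following Python code is only an illustration under stated assumptions, not the method of any particular coder: it assumes the autocorrelation method with the Levinson-Durbin recursion for the short-term filter, an open-loop correlation search for the single-tap pitch filter, and a lag range of 20-147 samples, which is a common choice at 8 kHz but is not given in the text. All function names are hypothetical, and windowing, quantisation and interpolation of the coefficients are left out.

```python
FS = 8000        # sampling rate in Hz (from the text)
LPC_ORDER = 10   # short-term predictor order (from the text)


def samples_per_block(duration_ms, fs=FS):
    """Number of samples in a block of the given duration."""
    return fs * duration_ms // 1000


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of one analysis block (no windowing)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order=LPC_ORDER):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    Returns (a, err): the predictor is x_hat[n] = sum_k a[k] * x[n-1-k]
    and err is the remaining prediction-error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            break  # degenerate block (e.g. silence): stop early
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err


def short_term_residual(x, a):
    """Residual e[n] = x[n] - sum_k a[k] * x[n-1-k], zero history assumed."""
    return [x[n] - sum(a[k] * x[n - 1 - k]
                       for k in range(len(a)) if n - 1 - k >= 0)
            for n in range(len(x))]


def pitch_predictor(res, start, length, min_lag=20, max_lag=147):
    """Single-tap long-term predictor for res[start:start+length].

    Searches the (assumed) lag range for the lag T that maximises the
    normalised correlation with the residual T samples earlier, and
    returns (T, gain) with gain = sum(e[n]e[n-T]) / sum(e[n-T]^2).
    """
    best_lag, best_gain, best_score = min_lag, 0.0, 0.0
    for lag in range(min_lag, min(max_lag, start) + 1):
        span = range(start, start + length)
        num = sum(res[n] * res[n - lag] for n in span)
        den = sum(res[n - lag] ** 2 for n in span)
        if den > 0.0 and num > 0.0:
            score = num * num / den
            if score > best_score:
                best_lag, best_gain, best_score = lag, num / den, score
    return best_lag, best_gain
```

With these helpers, a 20 ms block is samples_per_block(20) = 160 samples and a 5 ms pitch update interval is samples_per_block(5) = 40 samples, so each LPC block would contain four pitch subframes. Because the pitch lag can exceed the subframe length, the search reads residual samples from before the subframe start, which is why the function takes the whole residual history rather than the subframe alone.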

