Digital Signal Processing Reference
and some errors are inevitable. The overall performance of pitch-estimation
algorithms, however, is generally good. Voiced/unvoiced classification, on
the other hand, has moved from a single binary indicator, where each block of
speech was classified as either voiced or unvoiced, to more elaborate
frequency-domain mixed decisions. This has increased the quality of synthetic
speech dramatically. Developments in mixed-voicing classification have also
improved the performance of voicing estimation under noisy conditions.
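
The contrast between a single binary indicator and a frequency-domain mixed decision can be sketched as follows. This is only an illustration under stated assumptions, not the method of any coder cited in this chapter: the pitch is assumed to be already known and expressed as a DFT bin spacing (`pitch_bin`), voicing is measured by a simple harmonic-energy ratio chosen for this example, and the function names, the four equal-width bands and the 0.5 threshold are all hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """Hann-windowed power spectrum for bins 0 .. N/2 - 1 (naive O(N^2) DFT)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * t / n))
                for t, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def harmonic_fraction(power, lo, hi, pitch_bin):
    """Share of the energy in bins lo .. hi-1 lying within one bin of a pitch harmonic.

    Illustrative voicing measure (an assumption of this sketch).
    """
    total = sum(power[lo:hi])
    if total == 0.0:
        return 0.0
    near = 0.0
    for k in range(lo, hi):
        r = k % pitch_bin
        if min(r, pitch_bin - r) <= 1:
            near += power[k]
    return near / total


def binary_voicing(frame, pitch_bin, threshold=0.5):
    """Single voiced/unvoiced flag for the whole frame."""
    power = power_spectrum(frame)
    return harmonic_fraction(power, 0, len(power), pitch_bin) > threshold


def band_voicing(frame, pitch_bin, num_bands=4, threshold=0.5):
    """One voiced/unvoiced flag per equal-width frequency band."""
    power = power_spectrum(frame)
    width = len(power) // num_bands
    return [harmonic_fraction(power, b * width, (b + 1) * width, pitch_bin) > threshold
            for b in range(num_bands)]


def demo_frame(n=256):
    """Synthetic frame: pitch harmonics in the lower half of the spectrum,
    components midway between harmonics (a crude stand-in for noise) above."""
    harmonic = list(range(8, 64, 8))      # bins 8, 16, ..., 56
    inharmonic = list(range(68, 128, 8))  # bins 68, 76, ..., 124
    return [sum(math.sin(2 * math.pi * k * t / n) for k in harmonic + inharmonic)
            for t in range(n)]


frame = demo_frame()
print(binary_voicing(frame, pitch_bin=8))
print(band_voicing(frame, pitch_bin=8))
```

For this synthetic frame the per-band flags mark the two lower bands as voiced and the two upper bands as unvoiced, while the single flag has to label the whole frame one way or the other and so discards that distinction. A practical classifier would use a more robust voicing measure and a real pitch estimate.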