and some errors are inevitable. The overall performance of the pitch-estimation
algorithms can nevertheless be considered good. Voiced/unvoiced
classification, on the other hand, has moved from a single binary indicator,
where each block of speech was classified as either voiced or unvoiced, to
more elaborate frequency-domain mixed decisions, in which different parts of
the spectrum can receive different voicing decisions. This has improved the
quality of synthetic speech dramatically. The performance of voicing estimation
under noisy conditions has also improved with developments in
mixed-voicing classification.
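The difference between a single binary indicator and a frequency-domain mixed decision can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than a method taken from the text: the synthetic frame (harmonics of a 125 Hz pitch below 2 kHz, noise-like content above), the four 1 kHz bands, the harmonic energy ratio used as a voicing measure, and the 0.6 threshold. Practical coders use more robust measures and an estimated, not known, pitch.

```python
import cmath
import math
import random

FS = 8000          # sampling rate in Hz (assumed)
N = 256            # frame length; bin spacing is FS / N = 31.25 Hz
PITCH_BIN = 4      # assumed known pitch of 125 Hz, i.e. exactly 4 bins
THRESHOLD = 0.6    # illustrative voicing threshold, not from the text
BANDS = [(1, 32), (32, 64), (64, 96), (96, 128)]  # four bands of about 1 kHz


def make_frame():
    """Synthetic frame: pitch harmonics below 2 kHz, noise-like content above."""
    rng = random.Random(0)
    frame = [0.0] * N
    # Voiced part: unit-amplitude harmonics at bins 4, 8, ..., 60.
    for k in range(PITCH_BIN, 64, PITCH_BIN):
        for n in range(N):
            frame[n] += math.cos(2 * math.pi * k * n / N)
    # Unvoiced part: one random-phase sinusoid in every bin from 2 to 4 kHz.
    for k in range(64, 128):
        phase = rng.uniform(0, 2 * math.pi)
        for n in range(N):
            frame[n] += 0.3 * math.cos(2 * math.pi * k * n / N + phase)
    return frame


def power_spectrum(frame):
    """Naive DFT power for bins 0 .. N/2 - 1 (adequate for one short frame)."""
    spec = []
    for k in range(N // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / N)
                  for n, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def harmonic_ratio(power, lo, hi):
    """Fraction of the energy in bins [lo, hi) that lies on pitch harmonics."""
    total = sum(power[lo:hi])
    harmonic = sum(power[k] for k in range(lo, hi) if k % PITCH_BIN == 0)
    return harmonic / total if total > 0 else 0.0


power = power_spectrum(make_frame())

# Single binary indicator: one decision for the whole frame.
binary_decision = harmonic_ratio(power, 1, N // 2) > THRESHOLD

# Mixed decision: one voicing decision per frequency band.
band_decisions = [harmonic_ratio(power, lo, hi) > THRESHOLD for lo, hi in BANDS]

print("binary:", binary_decision)
print("per band:", band_decisions)
```

For this synthetic frame the single indicator labels the whole block as voiced, whereas the band-wise decision marks the two lower bands as voiced and the two upper bands as unvoiced, so a synthesizer could use periodic excitation below 2 kHz and noise excitation above it.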