Robust Emotion Recognition using Speaking Rate Features - Robust Emotion Recognition Using Spectral and Prosodic Features

5.6 Summary

In this chapter, a two-stage classification approach has been proposed to resolve the classification ambiguity among highly confusable emotions and thereby enhance emotion recognition performance. The approach employs a combination of spectral and prosodic features. In the first stage, eight emotions are classified into three broader categories, namely active, normal and passive, based on the speaking rate. In the second stage, the emotions within each broad group are classified into individual categories. It has been observed that emotion classification performance after the first stage is very high. The proposed two-stage classification has considerably improved the emotion recognition performance. This method demonstrates a multi-stage emotion classification approach with feature combination.
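The two-stage idea summarized above can be illustrated with a minimal sketch. It is not the chapter's implementation. The emotion names, their assignment to broad groups, the speaking-rate thresholds (`SLOW_MAX`, `FAST_MIN`) and the nearest-centroid second stage are all assumptions made for illustration, and the short feature vectors merely stand in for combined spectral and prosodic features.

```python
from math import dist

# Hypothetical grouping of emotions into broad categories (illustration only).
GROUPS = {
    "active": ["anger", "happiness"],
    "normal": ["neutral", "disgust"],
    "passive": ["sadness", "fear"],
}

# Assumed speaking-rate thresholds in syllables per second (not from the chapter).
SLOW_MAX = 3.5
FAST_MIN = 5.0


def broad_group(speaking_rate):
    """Stage 1: choose a broad category from the speaking rate alone."""
    if speaking_rate >= FAST_MIN:
        return "active"
    if speaking_rate <= SLOW_MAX:
        return "passive"
    return "normal"


def train_centroids(samples):
    """Average the feature vectors of each emotion.

    samples is a list of (feature_vector, emotion) pairs.
    """
    sums, counts = {}, {}
    for features, emotion in samples:
        acc = sums.setdefault(emotion, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[emotion] = counts.get(emotion, 0) + 1
    return {e: [v / counts[e] for v in acc] for e, acc in sums.items()}


def classify(speaking_rate, features, centroids):
    """Stage 2: pick the nearest centroid among emotions of the stage-1 group."""
    group = broad_group(speaking_rate)
    candidates = [e for e in GROUPS[group] if e in centroids]
    emotion = min(candidates, key=lambda e: dist(features, centroids[e]))
    return group, emotion


# Toy data: two-dimensional vectors standing in for real features.
samples = [
    ([1.0, 0.0], "anger"), ([0.9, 0.1], "anger"),
    ([0.0, 1.0], "happiness"),
    ([0.5, 0.5], "neutral"), ([0.2, 0.2], "disgust"),
    ([0.1, 0.9], "sadness"), ([0.8, 0.8], "fear"),
]
centroids = train_centroids(samples)
result = classify(5.6, [0.95, 0.05], centroids)
```

Restricting the second stage to the emotions of one broad group is what reduces confusion: an utterance with a fast speaking rate is never compared against the emotions assigned to the passive group, however close their feature vectors may be.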

