Digital Signal Processing Reference
In-Depth Information
Table 3.6 English transcriptions of the Telugu text prompts of IITKGP-SESC

Sentence identity   Text prompts
S1                  thallidhandrulanu gauravincha valenu
S2                  mI kOsam chAlA sEpatnimchi chUsthunnAmu
S3                  samAjamlo prathi okkaru chadhuvuko valenu
S4                  ellappudu sathyamune paluka valenu
S5                  I rOju nEnu tenali vellu chunnAnu
S6                  kOpamunu vIdi sahanamunu pAtincha valenu
S7                  anni dAnamulalo vidyA dAnamu minnA
S8                  uchitha salahAlu ivvarAdhu
S9                  dongathanamu cheyutA nEramu
S10                 I rOju vAthAvaranamu podigA undhi
S11                 dEsa vAsulandharu samaikhyAthA tho melaga valenu
S12                 mana rAshtra rAjadhAni hyderAbAd
S13                 sangha vidhrOha sekthulaku Ashrayam kalpincharAdhu
S14                 thelupu rangu shAnthiki chihnamu
S15                 gangA jalamu pavithra mainadhi
the syllable level, is fixed at 4, which is equal to the maximum number of syllables
in any group, as shown in Table 3.5 .
Out of the 10 speakers of IITKGP-SESC, the speech utterances of 8 speakers
(4 male and 4 female) are used to train the emotion recognition models. The
trained models are validated using the speech data of the remaining 2 speakers
(1 male and 1 female). The details of the speech corpus, IITKGP-SESC, are given
in Sect. 2.2 of Chap. 2. The development of the emotion recognition models and
their verification are discussed in the next section.
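The speaker-wise partition described above can be illustrated with a minimal sketch. It assumes each utterance is stored as a (speaker_id, emotion, features) record. The speaker labels M1 to M5 and F1 to F5, the helper name split_by_speaker, and the choice of M5 and F5 as the held-out pair are all hypothetical; the text states only that 4 male and 4 female speakers are used for training and 1 male and 1 female speaker for validation.

```python
def split_by_speaker(utterances, held_out_speakers):
    """Partition utterance records into training and validation sets by speaker.

    Each record is a tuple (speaker_id, emotion, features). Records whose
    speaker is in held_out_speakers go to validation; all others go to training.
    """
    held_out = set(held_out_speakers)
    train, validation = [], []
    for record in utterances:
        speaker_id = record[0]
        if speaker_id in held_out:
            validation.append(record)
        else:
            train.append(record)
    return train, validation


# Hypothetical speaker labels: 5 male (M1..M5) and 5 female (F1..F5).
speakers = [f"M{i}" for i in range(1, 6)] + [f"F{i}" for i in range(1, 6)]

# Toy records with placeholder features, one per speaker (not real corpus data).
utterances = [(spk, "anger", [0.0, 0.0]) for spk in speakers]

# Assumed held-out pair: one male and one female speaker.
train, validation = split_by_speaker(utterances, held_out_speakers=["M5", "F5"])

train_speakers = {rec[0] for rec in train}
validation_speakers = {rec[0] for rec in validation}
```

Splitting by speaker rather than by utterance keeps the validation speakers entirely unseen during training, so the validation result reflects performance on new voices.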
3.5 Results and Discussion
Emotion recognition systems are developed separately using global and local
prosodic features at the sentence, word, and syllable levels. The combination of
global and local features is also explored for emotion recognition (ER).
3.5.1 Emotion Recognition Systems using Sentence Level Prosodic Features
In this work, we consider 8 emotions of IITKGP-SESC to study the role of
global and local prosodic features in recognizing speech emotions. SVMs are
used to develop the emotion recognition models. Each SVM is trained with positive and
 
 