Digital Signal Processing Reference
[Figure: four panels, (a) and (c) for the fast utterance, (b) and (d) for the slow utterance; panels (a) and (b) show F1, panels (c) and (d) show F2; horizontal axis: Frequency (Hz)]
Fig. 5.2 Distribution of frame-wise F1 and F2 values for fast and slow utterances: (a) F1 for fast, (b) F1 for slow, (c) F2 for fast, (d) F2 for slow. Text: mAtA aur pitA kA Adar karnA chAhiye
Table 5.2 Classification of speech utterances based on the speaking rate using spectral features

Speaking rate    Recognition performance (%)
                 Super-slow    Slow    Normal    Fast    Super-fast
Super-slow           53         33       14        0         0
Slow                 20         63       10        7         0
Normal                0          0       97        3         0
Fast                  0          0        3       97         0
Super-fast            0          0        0        0       100
Average recognition performance: 82%
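The 82% average reported with Table 5.2 matches the mean of the table's diagonal entries (the percentage of each speaking-rate class assigned to its own class). The sketch below reproduces that arithmetic. It assumes that rows are the actual speaking rate and columns are the recognized speaking rate, which the table does not state explicitly; the names `LABELS`, `CONFUSION`, `per_class_recognition`, and `average_recognition` are illustrative and not from the source.

```python
# Values transcribed from Table 5.2 (percentages).
# Assumption: rows = actual speaking rate, columns = recognized speaking rate.
LABELS = ["super-slow", "slow", "normal", "fast", "super-fast"]
CONFUSION = [
    [53, 33, 14, 0, 0],    # super-slow
    [20, 63, 10, 7, 0],    # slow
    [0, 0, 97, 3, 0],      # normal
    [0, 0, 3, 97, 0],      # fast
    [0, 0, 0, 0, 100],     # super-fast
]


def per_class_recognition(matrix):
    """Return the diagonal entries: the percentage of each class
    that was assigned to its own class."""
    return [row[i] for i, row in enumerate(matrix)]


def average_recognition(matrix):
    """Return the unweighted mean of the diagonal entries."""
    diagonal = per_class_recognition(matrix)
    return sum(diagonal) / len(diagonal)


if __name__ == "__main__":
    for label, pct in zip(LABELS, per_class_recognition(CONFUSION)):
        print(f"{label}: {pct}%")
    print(f"average: {average_recognition(CONFUSION):.0f}%")
```

The off-diagonal entries show where the errors concentrate: the super-slow and slow classes are mostly confused with each other and with normal, while the normal, fast, and super-fast classes are recognized almost perfectly.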
5.3 Two Stage Emotion Recognition System
In this work, speech emotion recognition is carried out in two stages. In the first
stage, the emotions are grouped into three broad groups, namely active, normal,
and passive, corresponding to fast, normal, and slow speaking rates. Generally, active
emotions are expressed enthusiastically and with more energy, whereas passive emotions
are expressed in a subdued manner, with less intensity. Categorizing emotions into
these three broad groups is known as gross level emotion classification. In the second
 
 