Robust Emotion Recognition using Speaking Rate Features - Robust Emotion Recognition Using Spectral and Prosodic Features - page 87


[Figure: four histogram panels, (a) to (d); left column: Fast, right column: Slow; x-axis: Frequency (Hz)]

Fig. 5.2 Distribution of frame-wise F1 and F2 values for fast and slow utterances: (a) F1 for fast, (b) F1 for slow, (c) F2 for fast, (d) F2 for slow. Text: mAtA aur pitA kA Adar karnA chAhiye

Table 5.2 Classification of speech utterances based on the speaking rate using spectral features

Speaking rate    Recognition performance (%)
                 Super-slow    Slow    Normal    Fast    Super-fast
Super-slow       53            33      14        0       0
Slow             20            63      10        7       0
Normal           0             0       97        3       0
Fast             0             0       3         97      0
Super-fast       0             0       0         0       100

Average emotion recognition: 82%
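
The 82% average can be reproduced from the table as the mean of its diagonal entries, i.e. the per-class correct recognition rates. The following is a minimal sketch of that arithmetic; the names `LABELS`, `CONFUSION`, and `average_recognition` are illustrative assumptions, not taken from the book.

```python
# Rows: actual speaking rate; columns: recognized speaking rate (values in %).
# Values are copied from Table 5.2; all names here are hypothetical.
LABELS = ["Super-slow", "Slow", "Normal", "Fast", "Super-fast"]

CONFUSION = [
    [53, 33, 14, 0, 0],    # Super-slow
    [20, 63, 10, 7, 0],    # Slow
    [0, 0, 97, 3, 0],      # Normal
    [0, 0, 3, 97, 0],      # Fast
    [0, 0, 0, 0, 100],     # Super-fast
]


def average_recognition(matrix):
    """Return the mean of the diagonal (per-class recognition rates)."""
    diagonal = [matrix[i][i] for i in range(len(matrix))]
    return sum(diagonal) / len(diagonal)


if __name__ == "__main__":
    for label, row in zip(LABELS, CONFUSION):
        # Each row of percentages should account for all test utterances.
        assert sum(row) == 100, label
    print(f"Average recognition: {average_recognition(CONFUSION):.0f}%")
```

The diagonal entries are 53, 63, 97, 97, and 100, which sum to 410; dividing by the five classes gives 82. Most of the confusion is between the super-slow and slow classes.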

5.3 Two Stage Emotion Recognition System

In this work, speech emotion recognition is carried out in two stages. In the first stage, the emotions are grouped into three broad groups, namely active, normal, and passive, corresponding to fast, normal, and slow speaking rates. Generally, active emotions are expressed enthusiastically, with more energy, whereas passive emotions are expressed in a dull mood, with less intensity. Categorizing emotions into these three broad groups is known as gross-level emotion classification. In the second
