Digital Signal Processing Reference
In-Depth Information
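[Figure 1.5: distributions for three mixed-gender (M+F) data sets. Horizontal axis: Fundamental Frequency (Hz), 0 to 600; vertical axis ticks: 0 to 0.4. Legend: Neutral M+F (m = 145 Hz; s = 81 Hz), TellMe M+F (m = 177 Hz; s = 94 Hz), AA M+F (m = 161 Hz; s = 80 Hz).]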
Fig. 1.5 Distribution of fundamental frequency in neutral, Tell-Me, and AA sessions
Table 1.2 Formant center frequencies and bandwidths (in parentheses)
Formants and bandwidths (Hz)

Gender  Scenario  F1         F2           F3           F4
F       Neutral   555 (219)  1,625 (247)  2,865 (312)  4,012 (327)
F       Tell-Me   703 (308)  1,612 (276)  2,836 (375)  3,855 (346)
F       AA        710 (244)  1,667 (243)  2,935 (325)  4,008 (329)
M       Neutral   450 (188)  1,495 (209)  2,530 (342)  3,763 (343)
M       Tell-Me   472 (205)  1,498 (214)  2,525 (341)  3,648 (302)
M       AA        503 (188)  1,526 (215)  2,656 (330)  3,654 (369)
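As a reading aid for Table 1.2, the F1 center frequencies can be expressed as percent changes relative to the neutral scenario. The sketch below only does that arithmetic on the tabulated values; the helper name relative_shift and the dictionary layout are illustrative assumptions, not part of the original study.

```python
# F1 center frequencies (Hz) copied from Table 1.2.
f1_hz = {
    "F": {"Neutral": 555, "Tell-Me": 703, "AA": 710},
    "M": {"Neutral": 450, "Tell-Me": 472, "AA": 503},
}


def relative_shift(table, baseline="Neutral"):
    """Percent change of each scenario relative to the baseline scenario.

    Hypothetical helper for illustration; not taken from the source.
    """
    shifts = {}
    for gender, row in table.items():
        ref = row[baseline]
        shifts[gender] = {
            scenario: 100.0 * (value - ref) / ref
            for scenario, value in row.items()
            if scenario != baseline
        }
    return shifts


if __name__ == "__main__":
    for gender, row in relative_shift(f1_hz).items():
        for scenario, pct in row.items():
            print(f"{gender} {scenario}: F1 {pct:+.1f}% vs. neutral")
```

With the tabulated values, the female F1 rises by roughly 27% (Tell-Me) and 28% (AA) over neutral, and the male F1 by roughly 5% and 12%.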
In the next step, speech production parameters are analyzed. Distributions of
fundamental frequency in passenger conversations (denoted Neutral) and in Tell-Me
and AA conversations are depicted in Fig. 1.5, where M+F stands for mixed-gender
data sets. Both Tell-Me and AA samples display a consistent increase in mean
fundamental frequency (177 Hz and 161 Hz, respectively) compared to neutral (145 Hz).
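The kind of summary shown in Fig. 1.5 (a mean, a spread value, and a distribution over frequency) can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes an F0 track in Hz with unvoiced frames stored as 0, uses the population standard deviation, and picks a 25 Hz bin width over 0 to 600 Hz arbitrarily. The function name f0_summary and the toy track are hypothetical.

```python
from statistics import fmean, pstdev


def f0_summary(f0_hz, bin_width=25.0, f_max=600.0):
    """Mean, standard deviation, and relative-frequency histogram of voiced F0.

    Assumptions (not from the source): unvoiced frames are marked with 0,
    and the histogram covers [0, f_max) in bins of bin_width Hz.
    """
    voiced = [f for f in f0_hz if f > 0]  # keep voiced frames only
    if not voiced:
        raise ValueError("no voiced frames in the F0 track")

    n_bins = int(f_max // bin_width)
    counts = [0] * n_bins
    for f in voiced:
        idx = int(f // bin_width)
        if idx < n_bins:  # values at or above f_max are left out of the histogram
            counts[idx] += 1

    # Fraction of voiced frames that fall into each bin.
    rel_freq = [c / len(voiced) for c in counts]
    return fmean(voiced), pstdev(voiced), rel_freq


if __name__ == "__main__":
    # Made-up toy track: two unvoiced frames (0) and three voiced frames.
    toy_track = [0.0, 100.0, 200.0, 0.0, 300.0]
    mean_f0, std_f0, hist = f0_summary(toy_track)
    print(f"mean = {mean_f0:.1f} Hz, std = {std_f0:.1f} Hz")
```

Running the same function on the neutral, Tell-Me, and AA tracks separately would give per-scenario means that can be compared directly, which is the comparison the text draws.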
Mean center frequencies and bandwidths of the first four formants were extracted
from voiced speech segments using WaveSurfer [17]. They are compared for neutral,
Tell-Me, and AA conversations in Table 1.2. The voiced segments were identified
based on the output of the pitch tracking algorithm implemented in [17] (RAPT [18]).
Mean center frequencies and standard deviations of F1 are displayed in Fig. 1.6.
A consistent increase in F1 can be observed for Tell-Me and AA data. In AA,