Digital Signal Processing Reference
Table 10.27 Intoxication and sleepiness baseline results by UA and WA. SMO learnt pairwise
SVM with linear Kernel, complexity optimised on development partition to 0.01 (intoxication) and
0.02 (sleepiness)
[%]                           Intoxication      Sleepiness
Features                      UA      WA        UA      WA
Train versus develop
IS 2009 EC                    57.4    65.3      65.3    64.2
IS 2010 PC                    61.6    66.1      65.1    66.4
IS 2011 SSC                   65.3    69.2      67.3    69.1
Train + develop versus test
IS 2009 EC                    60.3    60.2      68.0    72.4
IS 2010 PC                    63.2    62.6      70.2    72.8
IS 2011 SSC                   65.9    66.4      70.3    72.9
SMOTE was applied to the (united) learning instances. Feature sets IS 2009 EC, IS 2010 PC, and
IS 2011 SSC correspond to the official sets of the Challenges (Emotion [72], Paralinguistic
[75, 179], and Speaker State [77] held at INTERSPEECH in the respective years)
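The two measures reported in the table can be illustrated with a minimal sketch. It assumes that UA denotes unweighted accuracy (the mean of the per-class recalls) and WA denotes weighted accuracy (the overall fraction of correctly classified instances), which is the usual reading in the Challenges cited above. The function names and the toy labels are hypothetical and are not taken from the source.

```python
from collections import Counter


def recall_per_class(y_true, y_pred):
    """Return {class: recall} for every class that occurs in y_true."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: correct[c] / total[c] for c in total}


def unweighted_accuracy(y_true, y_pred):
    """UA (assumed definition): mean of the per-class recalls."""
    recalls = recall_per_class(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def weighted_accuracy(y_true, y_pred):
    """WA (assumed definition): fraction of all instances classified correctly."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Made-up, imbalanced toy data: a classifier that always answers "sober".
y_true = ["sober"] * 8 + ["intoxicated"] * 2
y_pred = ["sober"] * 10

ua = unweighted_accuracy(y_true, y_pred)  # (1.0 + 0.0) / 2
wa = weighted_accuracy(y_true, y_pred)    # 8 / 10
print(f"UA = {ua:.1%}, WA = {wa:.1%}")
```

On such imbalanced data WA rewards always predicting the majority class, whereas UA stays at chance level for two classes. This is presumably why both measures are reported side by side.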
10.4.4.5 Summary
This section demonstrated the automatic recognition of speakers' intoxication and
sleepiness. As in the previous challenges, majority voting over the best participants'
results led to the overall best results of UA 72.2 % (intoxication) and UA 72.5 %
(sleepiness). In [216], however, it was shown that intoxication recognition
performance can be further boosted by incorporating not a single speech chunk but a
series of chunks. This makes sense, as we are dealing with temporally 'more permanent'
speaker states. In addition, a focus on specific linguistic entities such as tongue
twisters was shown to be beneficial. It seems promising to elaborate further on these
findings for other more permanent states and traits.
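Majority voting over several systems' decisions, as mentioned above, can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how many systems were fused or how ties were broken, so the tie-breaking rule (prefer the earliest-listed system among the tied labels), the function name `majority_vote`, and the labels "S" (sleepy) and "NS" (not sleepy) are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse label sequences from several systems by per-instance majority vote.

    predictions: list of label lists, one per system, all of equal length.
    Ties are resolved in favour of the earliest-listed system (an assumption).
    """
    fused = []
    for labels in zip(*predictions):  # labels of all systems for one instance
        counts = Counter(labels)
        top = max(counts.values())
        # First label, in system order, that reaches the highest count.
        fused.append(next(label for label in labels if counts[label] == top))
    return fused


# Made-up decisions of three hypothetical systems on four test instances.
system_a = ["S", "NS", "S", "NS"]
system_b = ["S", "S", "NS", "NS"]
system_c = ["NS", "S", "S", "NS"]

fused = majority_vote([system_a, system_b, system_c])
print(fused)
```

With an odd number of systems and two classes no ties can occur, so the tie-breaking rule only matters for other configurations.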
One of the most promising future directions seems to be the coupling of tasks:
they all influence each other to some degree, and it seems intuitive to assess, for
example, age and sleepiness or emotion and gender together rather than in isolation.
Further ideas for future research and a summary of recent trends can be found
in [217].
References
1. Shriberg, E.: Spontaneous speech: how people really talk and why engineers should care. In:
Proceedings of Eurospeech, pp. 1781-1784. Lisbon (2005)
2. Schuller, B., Ablameier, M., Müller, R., Reifinger, S., Poitschke, T., Rigoll, G.: Speech
communication and multimodal interfaces. In: Kraiss, K.-F. (ed.) Advanced Man Machine
Interaction. Signals and Communication Technology. Chapter 4, pp. 141-190. Springer, Berlin
(2006)
3. Lee, C.-C., Black, M., Katsamanis, A., Lammert, A., Baucom, B., Christensen, A., Georgiou,
P., Narayanan, S.: Quantification of prosodic entrainment in affective spontaneous spoken
 
 