20.5 Conclusions and Future Work
In this work, we approached the notion of conflict occurring in call center
interactions as a complex problem that decomposes into subtasks related to (a)
perceiving and perceptually decoding the emotions occurring in such interactions,
(b) automatically classifying them, and (c) exploring the turn-taking structure to find
cues and patterns related to conflict.
With regard to the perceptual decoding of vocal emotional expressions, the
high agreement scores between Greek annotators indicate the existence of salient
perceptual cues that allow listeners to adequately perceive the emotional trace of
an utterance, independently of the context. The small-scale perceptual experiment
involving native and nonnative raters showed that familiarity with the linguistic
content largely improves the assessment of emotion as positive or negative.
Moreover, we presented an SVM-based algorithm that classifies emotional units
extracted from the conversations as positive or negative; its best version
(SVM-RBF2, N = 6) obtained an accuracy score that is 6% higher than that of a majority
classifier. Finally, in a subset of our corpus (churns), we measured the distribution of
turn-taking types, and we explored the association of overlapping speech, as well as
of overlapping turn-taking and emotion values, with the presence of conflict. We
found that overlapping speech occurs mostly with conflict-related turn-taking
labels (e.g., grab, yield) and that these labels are more strongly correlated with
negative emotions.
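The comparison against a majority classifier mentioned above can be made concrete with a minimal sketch. It assumes binary "pos"/"neg" labels and uses invented toy data; the function names and the labels are illustrative assumptions and do not reproduce the corpus, the SVM, or the reported scores.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the reference labels."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy labels invented for illustration only (not taken from the corpus).
toy_true = ["neg", "neg", "neg", "pos", "pos", "neg", "pos", "neg", "neg", "pos"]
toy_pred = ["neg", "neg", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos"]

baseline = majority_baseline_accuracy(toy_true)  # always predicts "neg"
model_acc = accuracy(toy_true, toy_pred)         # hypothetical classifier output
gain = model_acc - baseline                      # improvement over the baseline

print(f"baseline={baseline:.2f} model={model_acc:.2f} gain={gain:.2f}")
```

Reporting the gain over this baseline, rather than raw accuracy alone, guards against overstating performance when one emotion class dominates the data.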
In the future, we plan to improve the classification task by using speaker diarization
and speech segmentation techniques to automatically segment recordings and come
up with conversation units of variable duration that include additional features
coming from the turn-taking structure and overlapping speech points. Moreover,
to allow for generalizations, a cross-lingual study on the human ability to
perceptually decode emotional vocal expressions derived from dyadic call center
interactions is foreseen, involving more subjects and distinct experimental conditions.
Future work will also investigate ways of incorporating and modeling the
temporal sequence and transitions of emotional states, both within the same speaker
and between the two speakers, to capture conflict escalation and de-escalation,
as well as discourse structures, in order to improve the automatic classification
of conflictual conversations from a business perspective.
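As one illustration of what modeling transitions of emotional states could look like, the sketch below estimates first-order transition probabilities from a sequence of emotion labels. This is an assumption made for illustration, not the method planned by the authors; the function names and the toy sequence are hypothetical.

```python
from collections import Counter


def transition_counts(states):
    """Count consecutive (previous, next) label pairs in a sequence."""
    return Counter(zip(states, states[1:]))


def transition_probabilities(states):
    """Estimate P(next | previous) from consecutive label pairs."""
    counts = transition_counts(states)
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}


# Toy sequence of per-unit emotion labels for one speaker (invented data).
toy_sequence = ["pos", "pos", "neg", "neg", "neg", "pos"]
probs = transition_probabilities(toy_sequence)

for (src, dst), p in sorted(probs.items()):
    print(f"{src} -> {dst}: {p:.2f}")
```

Under this representation, a high neg-to-neg probability would suggest sustained negativity, while the same counting applied to alternating speaker turns would describe transitions between the two speakers.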
Acknowledgments The research leading to these results has been partially funded by the
POLYTROPON project (KRIPIS-GSRT, MIS: 448306). Also, participation in the Dagstuhl Seminar 13451
“Computational Audio Analysis”, held from Nov 3 to 8, 2013, in Wadern, Germany, inspired Anna
Esposito to contribute to this work.