Disentangling Synchronous Conversations

On the surface, synchronous conversations, such as meetings and chats, consist of a linear sequence of turns appearing one after the other as the conversation evolves over time. However, a single stream of turns often contains several simultaneous conversations. For instance, Aoki et al. [2006] found that in speech conversations involving 8 to 10 speakers, an average of 1.76 distinct conversations were occurring at the same time. This phenomenon was shown to be even more pronounced in chats, where a recent study found that an average of 2.75 conversations were simultaneously active [Elsner and Charniak, 2010].
The difference between spoken conversations and chats can be explained by considering one aspect of chats that makes them similar to asynchronous text conversations (e.g., email): chats do not allow participants to control the positioning of their contributions [Smith et al., 2000]. In other words, if you send an answer to a question in a chat, since several participants can send different messages simultaneously, there is no guarantee that your answer will follow the original question. Other contributions, possibly unrelated to the question, may appear between the question and your answer.
Since in chats and other synchronous conversations what appears to be a single stream of turns often contains multiple, independent, interwoven conversations, we are faced with the challenge of identifying those conversations; that is, we must disentangle the conversations.
A two-step approach to disentangling conversations was recently proposed by Elsner and Charniak [2010]. The first step is based on supervised classification. For each pair of turns (x, y) in a chat stream, a binary classifier determines how likely it is that the two turns x and y belong to the same conversation. The classifier is trained on a set of features that are grouped into three classes (see Elsner and Charniak [2010] for a complete list):
• Chat-specific: including, for instance, the temporal distance between x and y and whether x mentions the speaker of y (or vice versa).
• Discourse: including, for instance, whether x and y use a greeting word ("hello" etc.), an answer word ("yes", "no" etc.), or the word "thanks", or ask a question (marked explicitly with a question mark).
• Content: including, for instance, whether x and y both use technical jargon, neither does, or only one does.
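To make the first step concrete, here is a minimal sketch of such a pairwise classifier. The Turn record, the word lists, and the handful of features below are simplified illustrations standing in for the three feature classes; Elsner and Charniak's actual feature set is considerably richer.

```python
# A minimal sketch of the pairwise same-conversation classifier; the Turn
# record, word lists, and features are illustrative assumptions, not
# Elsner and Charniak's actual implementation.
from dataclasses import dataclass
from itertools import combinations

from sklearn.linear_model import LogisticRegression

GREETINGS = {"hello", "hi", "hey"}
ANSWER_WORDS = {"yes", "no", "yeah", "nope", "thanks"}

@dataclass
class Turn:
    time: float              # seconds since the start of the chat
    speaker: str
    text: str
    conv_id: int = -1        # gold conversation label (training data only)

def features(x: Turn, y: Turn) -> list[float]:
    """Chat-specific, discourse, and content features for a pair of turns."""
    wx, wy = set(x.text.lower().split()), set(y.text.lower().split())
    return [
        abs(x.time - y.time),                                       # time gap
        float(y.speaker.lower() in wx or x.speaker.lower() in wy),  # mention
        float(bool(wx & GREETINGS or wy & GREETINGS)),       # greeting word
        float(bool(wx & ANSWER_WORDS or wy & ANSWER_WORDS)), # answer word
        float("?" in x.text or "?" in y.text),               # asks a question
        len(wx & wy) / max(len(wx | wy), 1),                 # word overlap
    ]

def train_pair_classifier(turns: list[Turn]) -> LogisticRegression:
    """Fit a binary classifier on every pair of turns in a labeled chat."""
    X, y = [], []
    for a, b in combinations(turns, 2):
        X.append(features(a, b))
        y.append(int(a.conv_id == b.conv_id))  # 1 = same conversation
    return LogisticRegression(max_iter=1000).fit(X, y)
```

At test time, calling predict_proba on a pair's feature vector yields the probability that the two turns belong to the same conversation, which is the quantity the second step consumes.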
In the second step of the disentangling process, turns are clustered based on the output of the classifier from step one. In essence, the clustering algorithm tries to ensure that pairs of turns likely to belong to the same conversation (according to the classifier) end up in the same cluster, while pairs unlikely to belong to the same conversation (again, according to the classifier) end up in different clusters. The goal is for each resulting cluster to correspond to a different conversation.
Unfortunately, finding the optimal solution to this clustering problem is intractable. However,
the authors show that acceptable solutions can be found with heuristic algorithms.
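As an illustration of such a heuristic, the sketch below implements one plausible greedy strategy (not necessarily the authors' exact algorithm): turns are scanned in chronological order, and each turn is attached to the existing cluster whose members the classifier links it to most strongly, or starts a new conversation if no link is strong enough. It reuses the hypothetical Turn, features, and trained model from the previous sketch.

```python
# A sketch of one plausible greedy clustering heuristic, reusing the
# hypothetical Turn, features, and trained model defined above. This is
# an illustration of the general idea, not the authors' exact algorithm.
def disentangle(turns: list, model, threshold: float = 0.5) -> list:
    """Greedily assign each turn to the best-matching conversation cluster."""
    clusters = []                            # each cluster is a list of Turns
    for turn in turns:                       # scan turns in chronological order
        best_score, best_cluster = threshold, None
        for cluster in clusters:
            # Average same-conversation probability against cluster members.
            probs = [model.predict_proba([features(turn, other)])[0][1]
                     for other in cluster]
            score = sum(probs) / len(probs)
            if score > best_score:
                best_score, best_cluster = score, cluster
        if best_cluster is None:
            clusters.append([turn])          # no strong link: new conversation
        else:
            best_cluster.append(turn)
    return clusters
```

Because each turn is assigned once and never revisited, this runs in time roughly quadratic in the number of turns, rather than searching the exponentially large space of all possible partitions.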