email posing the two questions Q1 and Q2. However, this is not the case for the answer A22, which
appears in Email-5, after two intervening emails, Email-3 and Email-4.
Email-1
    Question Q1
    Question Q2

Email-2
    >Question Q1
    >Question Q2
    Answer A21

Email-3
    >Question Q1
    Answer A11
    >Question Q2

Email-4
    >>Question Q1
    Answer A12
    >Answer A11
    >>Question Q2

Email-5
    >>>Question Q1
    >Answer A12
    >>Answer A11
    >>>Question Q2
    Answer A22

Figure 3.10: Sample email thread that starts with Email-1, which contains two questions, Q1 and Q2.
These questions receive multiple answers in the following four emails. An answer labeled Ai,j
is answer j to question i.
For the answer detection task, Shrestha and McKeown also propose a supervised classification
approach: given a question q, a binary classifier determines, for any utterance u_i following
q in the thread, whether or not u_i is a response to q. Even with a large and complex set of
features, based on the lexical similarity between q and u_i as well as the positions of q and u_i in the
thread, the performance of this approach is modest (F-scores range from 0.5 to 0.7, depending on the
training data).
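To make the setup concrete, the following is a minimal sketch of how candidate (q, u_i) pairs and their features could be built before being passed to a binary classifier. The feature names (cosine, distance, relative_position) and the helper functions are illustrative assumptions, not the actual feature set used by Shrestha and McKeown; they only mirror the two feature families mentioned above, lexical similarity and position in the thread.

```python
import re
from collections import Counter
from math import sqrt


def tokens(text):
    """Lowercase word tokens; a deliberately simple tokenizer (assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between the bag-of-words vectors of two utterances."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def pair_features(thread, q_idx, u_idx):
    """Features for the pair (question at q_idx, later utterance at u_idx)."""
    q, u = thread[q_idx], thread[u_idx]
    return {
        "cosine": cosine(q, u),                           # lexical similarity
        "distance": u_idx - q_idx,                        # utterances between q and u_i
        "relative_position": u_idx / (len(thread) - 1),   # where u_i sits in the thread
    }


def candidate_pairs(thread, q_idx):
    """One candidate per utterance that follows the question in the thread."""
    return [(u_idx, pair_features(thread, q_idx, u_idx))
            for u_idx in range(q_idx + 1, len(thread))]


# Hypothetical three-utterance thread, in chronological order.
thread = [
    "When is the meeting?",
    "Lunch was great.",
    "The meeting is at noon.",
]
pairs = candidate_pairs(thread, 0)
for u_idx, feats in pairs:
    print(u_idx, feats)
```

Each feature dictionary would then be labeled (response or not) and used to train a classifier of one's choice; the sketch stops at feature extraction because the text does not specify which learner was used.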
One critical limitation of this work is that it does not consider quotation as a source of
information. As we will see in Section 3.4.5, quotation can be effectively exploited to create a
finer-level representation of the conversational structure, which, we will argue, can simplify several
mining tasks, including dialogue act labeling. For instance, looking again at Figure 3.10, the answer
A22 is far from Email-1 (which posed the corresponding question Q2), but it is adjacent to the
quotation of Q2 (in Email-5).
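The adjacency observation can be illustrated with a small sketch that reads quote depth from leading ">" markers and pairs each piece of new text with the quoted line immediately above it. This is only an illustration under simplifying assumptions (one utterance per line, quoting marked solely by ">"); it is not the method of Section 3.4.5, and the function names are hypothetical.

```python
def quote_depth(line):
    """Return (depth, text): the number of leading '>' markers and the remaining text."""
    stripped = line.lstrip()
    depth = 0
    while stripped.startswith(">"):
        depth += 1
        stripped = stripped[1:].lstrip()
    return depth, stripped


def link_new_text_to_quotes(email_lines):
    """Pair every unquoted (new) line with the closest quoted line above it."""
    links = []
    previous_quoted = None
    for line in email_lines:
        depth, text = quote_depth(line)
        if not text:
            continue                      # skip blank lines
        if depth > 0:
            previous_quoted = text        # remember the most recent quoted line
        else:
            links.append((text, previous_quoted))
    return links


# Email-5 from Figure 3.10.
email5 = [
    ">>>Question Q1",
    ">Answer A12",
    ">>Answer A11",
    ">>>Question Q2",
    "Answer A22",
]
print(link_new_text_to_quotes(email5))
# [('Answer A22', 'Question Q2')]
```

Although A22 is four emails away from the original question, the quoted copy of Q2 sits directly above it, so even this crude heuristic recovers the question-answer pair.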
Although the supervised methods we have discussed so far have generated very useful insights
into the task of dialogue act labeling of text conversations, they require large amounts of annotated
training data, which is not only difficult and extremely time-consuming to build, but also needs to
be created anew for each conversational modality. By comparison, semi-supervised methods represent
a valid alternative, since they can be easily applied to a new domain, as long as a considerable
amount of unlabeled data is available in that domain.