F-score for the different dialogue acts ranging from .44 to .85. As for the features used by the classifiers,
the best performance was achieved with a rich feature set, which included features based on the
identification of time and date expressions, part-of-speech tags, and bigrams. For a clear illustration of
why bigrams help in this task, consider the bigrams “I will” and “will you”. While these two
bigrams strongly indicate a commitment and a request dialogue act, respectively, the three
constituent words, “I”, “will”, and “you”, in isolation, are much less informative.
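As a minimal sketch of this idea (not the authors' actual feature pipeline), the snippet below extracts unigram and bigram counts with scikit-learn's CountVectorizer, so that cues such as “I will” and “will you” become distinct features available to a dialogue act classifier; the example messages are invented.

# Toy illustration of unigram vs. bigram features using scikit-learn
# (not the authors' exact feature set).
from sklearn.feature_extraction.text import CountVectorizer

messages = [
    "I will send the report by Friday.",   # wording typical of a Commit act
    "Will you send me the report?",        # wording typical of a Request act
]

# ngram_range=(1, 2) yields both single words and word pairs; the custom
# token_pattern keeps one-letter words such as "I", which the default drops.
vectorizer = CountVectorizer(ngram_range=(1, 2), token_pattern=r"(?u)\b\w+\b")
X = vectorizer.fit_transform(messages)

print(sorted(f for f in vectorizer.get_feature_names_out() if "will" in f))
# ['i will', 'will', 'will send', 'will you'] -- the bigrams 'i will' and
# 'will you' separate the two acts, while the unigram 'will' alone does not.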
A key limitation of Cohen et al.'s proposal is that it does not exploit the tendency of dialogue
acts to occur in adjacency pairs. It blindly classifies one email message at a time, without considering
dependencies between a message and its neighboring messages in the email thread. The same research
group addressed this limitation the following year in Carvalho and Cohen [2005], where they present
an iterative collective classification algorithm⁸ in which two classifiers are trained for each dialogue
act d_i. One classifier, Content_{d_i}, only looks at the content of the message (it is the same classifier
presented in Cohen et al. [2004]), whereas the other classifier, Context_{d_i}, takes into account both
the content of the message and the context in which the message occurs, i.e., the dialogue act labels
of its parent and children. The algorithm works as follows.
1. Initialize the labels of each message by applying the Content classifiers (which do not need
labels for the other messages).
2. Repeat for a given number of iterations (60 in the proposal).
• Revise the labels of all the messages by applying to each message all the Context classifiers.
Figure 3.9 illustrates the algorithm's key operations.
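The following sketch spells out this loop in Python. The thread object, the per-act Content and Context classifiers, and the feature-building helpers are hypothetical placeholders; only the control flow mirrors the description above, and whether labels are revised in place or synchronously once per iteration is left open here.

def collective_classify(thread, content_clfs, context_clfs, n_iters=60):
    # Step 1: initialize each message's labels with the content-only
    # classifiers, which do not need labels on neighboring messages.
    labels = {
        msg.id: {act: clf.predict(msg.content_features())
                 for act, clf in content_clfs.items()}
        for msg in thread.messages
    }

    # Step 2: for a fixed number of iterations (60 in the proposal), revise
    # every message's labels with the context classifiers, which also see
    # the current labels of the message's parent and children.
    for _ in range(n_iters):
        revised = {}
        for msg in thread.messages:
            neighbor_labels = [labels[n.id] for n in msg.parent_and_children()]
            revised[msg.id] = {
                act: clf.predict(msg.content_features(), neighbor_labels)
                for act, clf in context_clfs.items()
            }
        labels = revised
    return labels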
Experimental results show that taking context into account does improve performance.
However, the improvements are modest and limited to some of the dialogue acts, which suggests that
exclusively supervised approaches to email dialogue act labeling may not be the ideal solution.
Similar results are obtained by Shrestha and McKeown [2004], who propose a supervised
approach for a rather different dialogue act labeling task. Instead of labeling each message in an
email thread with a subset of the labels in a tagset, they only determine whether any two sentences
in the thread form a question-answer adjacency pair. On the one hand, this is a more complex task,
because it operates at a finer level of granularity (single sentences vs. whole messages), but on the
other hand, it is a simpler task because it is limited to identifying only two dialogue acts.
In their work, the detection of question-answer pairs is broken down into two steps: first,
identify all the questions in the thread; next, for each question, detect the
corresponding answers. Let us examine these two steps in order.
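A schematic rendering of this two-step pipeline is given below. The names detect_question and is_answer_to stand in for the two trained classifiers and are hypothetical, not Shrestha and McKeown's actual models, and the assumption that answers follow their questions in thread order is ours.

def find_qa_pairs(sentences, detect_question, is_answer_to):
    # Step 1: identify the question sentences in the thread.
    questions = [s for s in sentences if detect_question(s)]

    # Step 2: for each question, keep the later sentences that the second
    # classifier judges to be answers to it.
    pairs = []
    for q in questions:
        for s in sentences:
            if s.position > q.position and is_answer_to(q, s):
                pairs.append((q, s))
    return pairs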
On the surface, it may appear that determining whether or not a sentence is a question should be
straightforward in written conversations such as email, because of the use of the question mark.
However, Shrestha and McKeown [2004] discuss three reasons why relying on question marks is
not sufficient.
⁸ This algorithm is an implementation of a Dependency Network [Heckerman et al., 2001].