4.4.1 ABSTRACTIVE CONVERSATION SUMMARIZATION: A DETAILED CASE STUDY
Recent work on summarizing multi-domain conversations has taken a more abstractive approach,
generating novel text to describe the conversation rather than extracting sentences from the con-
versation itself. We will describe one system in considerable detail to see how abstractive systems
differ from the extractive systems described previously. In the system of Murray et al. [2010], the
abstractive summarizer proceeds in a pipeline of interpretation, transformation, and generation. We
can first describe each of these stages at a high level:
• Interpretation. Mapping the input conversation to a source representation.
• Transformation. Transforming the source representation to a summary representation.
• Generation. Generating a summary text from the summary representation.
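The three stages above can be sketched as a simple function pipeline. This is an illustrative skeleton only; the function names, representations, and the toy "selection" step are assumptions for exposition, not the actual Murray et al. implementation:

```python
def interpret(conversation):
    # Interpretation: map the input to a source representation.
    # Here the representation is simply a list of (speaker, sentence) pairs.
    return [tuple(turn.split(": ", 1)) for turn in conversation]

def transform(source, max_items=2):
    # Transformation: select a summary representation from the source.
    # A length-limited prefix stands in for real content selection.
    return source[:max_items]

def generate(summary_repr):
    # Generation: realize the summary representation as text.
    return " ".join(f"{speaker} said: {sentence}"
                    for speaker, sentence in summary_repr)

conversation = ["Alice: We should ship on Friday.", "Bob: Agreed, Friday works."]
summary = generate(transform(interpret(conversation)))
# summary == "Alice said: We should ship on Friday. Bob said: Agreed, Friday works."
```

The point of the sketch is the strict separation of concerns: each stage consumes only the representation produced by the previous one.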
At a system level, the Murray et al. abstractive system carries out interpretation by mapping
conversation sentences to a simple conversation ontology written in OWL/RDF. This entails popu-
lating the ontology with instance data corresponding to the particular conversation participants, the
entities or topics discussed, and dialogue-acts such as decisions being made, problems encountered,
and opinions expressed. These latter sentence-level phenomena are determined using supervised
classifiers and a variety of structural, lexical and conversation features. The interpretation stage also
involves detecting messages, which are essentially collections of sentences that mention the same
entity, belong to the same participant and have the same dialogue act type. That is, a message is an
abstraction over multiple sentences.
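Message detection amounts to grouping sentences by a shared key. A minimal sketch, assuming hypothetical field names (`participant`, `entity`, `act`) for the classifier outputs described above:

```python
from collections import defaultdict

# Toy sentence records; in the real system the entity and dialogue-act
# fields come from supervised classifiers, not hand labels.
sentences = [
    {"participant": "Alice", "entity": "release date", "act": "decision",
     "text": "Let's ship Friday."},
    {"participant": "Alice", "entity": "release date", "act": "decision",
     "text": "Friday it is."},
    {"participant": "Bob", "entity": "test suite", "act": "problem",
     "text": "The tests are still failing."},
]

def detect_messages(sents):
    # A message abstracts over all sentences sharing the same
    # (participant, entity, dialogue-act) triple.
    groups = defaultdict(list)
    for s in sents:
        groups[(s["participant"], s["entity"], s["act"])].append(s["text"])
    return dict(groups)

messages = detect_messages(sentences)
# Two messages: Alice's decision about the release date (two sentences),
# and Bob's problem with the test suite (one sentence).
```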
The transformation stage is responsible for selecting the most informative messages. The
content selection is carried out using Integer Linear Programming (ILP), where a function involving
message weights and sentence weights is maximized given a summary length constraint. Messages
are weighted according to the number of sentences they contain (i.e., roughly how much information
they express), while sentences are weighted according to their posterior probabilities derived from the
supervised classifiers in the preceding interpretation stage (i.e., the predictions of decisions, actions,
problems and sentiment). The idea is that sentences relating to these types of phenomena should be
included in the summary. The output of the transformation stage is simply a set of messages.
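The shape of this objective can be illustrated with a toy selection step. For brevity the sketch below enumerates subsets rather than calling an ILP solver, and the weighting scheme (message weight as sentence count, sentence weight as classifier posterior) is a simplified reading of the description above, not the system's actual objective function:

```python
from itertools import combinations

# Hypothetical messages: (label, per-sentence posteriors, length in words).
messages = [
    ("decision", [0.9, 0.8], 12),
    ("problem",  [0.7],       6),
    ("chitchat", [0.2, 0.1, 0.1], 15),
]

def score(subset):
    # Message weight ~ number of sentences it contains;
    # sentence weight ~ posterior probability from the classifiers.
    return sum(len(post) + sum(post) for _, post, _ in subset)

def select(msgs, budget):
    # Maximize the combined weights subject to a summary length budget
    # (brute force over subsets; an ILP solver does this at scale).
    best, best_score = (), 0.0
    for r in range(len(msgs) + 1):
        for subset in combinations(msgs, r):
            if sum(length for *_, length in subset) <= budget:
                if score(subset) > best_score:
                    best, best_score = subset, score(subset)
    return [label for label, _, _ in best]

selected = select(messages, budget=20)
# → ['decision', 'problem']: together 18 words, and the low-posterior
#   chitchat message cannot be added without exceeding the budget.
```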
The generation stage takes those selected messages and creates a textual summary by associ-
ating elements of the ontology with linguistic annotations. For example, participants are associated
with an identifier such as their name, email or role in an organization. Topics or entities are simply
weighted noun phrases from the conversation. An individual summary sentence is realized by as-
sociating a verbal template with the message type. For example, instances of DecisionMessage are
associated with the verb make, have a subject template set to the noun phrase of the message source
(the participant), and have an object template [NP a decision PP [concerning]] where the object of
the prepositional phrase is the noun phrase associated with the message target.
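The realization step can be sketched as template filling keyed on message type. The template strings below paraphrase the description of the DecisionMessage pattern (verb make, object NP "a decision", PP headed by concerning); the second template is an invented analogue for illustration:

```python
# Verbal templates keyed on ontology message type. The source slot is
# filled by the participant's identifier, the target slot by the weighted
# noun phrase associated with the message.
templates = {
    "DecisionMessage": "{source} made a decision concerning {target}.",
    "ProblemMessage":  "{source} raised a problem concerning {target}.",
}

def realize(msg_type, source, target):
    # Produce one summary sentence from one selected message.
    return templates[msg_type].format(source=source, target=target)

sentence = realize("DecisionMessage", "Alice", "the release date")
# → "Alice made a decision concerning the release date."
```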
This system architecture is very similar to data-to-text systems such as those described in Portet et al.
[2009] and more generally in Reiter and Dale [2000], with the primary difference being textual input