For illustration, Figure 1.6 shows two abstractive summaries of our sample email conversation,
while Figure 1.7 shows one extractive summary of the same conversation. Notice how the level of
abstraction in abstractive summarization can vary considerably, with the first abstractive summary
being much more abstract than the second one.
Figure 1.6: Two abstractive summaries of our synthetic email conversation.
In Chapter 4, we will see that most of the summarizers for text conversations developed so
far are fundamentally extractive in nature. However, in that chapter, we will also cover a few very
recent studies on applying abstractive summarization to text conversations [Murray et al., 2010].
Generic vs. Query-based Summarization. Another important dimension related to the input of the
summarization process is whether the user is explicitly stating her information needs by means of a
query. If this is the case, a good summary should not be generated generically, but should focus on
the query, which, for instance, could refer to a particular event, date or person. In practice, a query-based
summarizer can focus on the query by taking the query into account when deciding whether
to include some content (a sentence or a piece of information) in the summary. This is typically
done by measuring the overlap/similarity between that content and the query. A similar approach
can be followed for text conversations. For instance, a common feature used for measuring informativeness
in email summarization is subject-line overlap or similarity (e.g., [Nenkova and Bagga,
2003]). If we combine the subject line with a user-provided query, we can generate query-dependent
summaries that tailor the summary to a particular information need. As another example, consider
work by Sharifi et al. [2010], where the task is automatically summarizing microblogs such as Twitter
messages. The algorithm takes as input a topic phrase (e.g., Ice Dancing) along with a set of
sentences from relevant tweets, and it generates an extractive, query-based summary intended to
concisely convey why the topic is currently popular on Twitter (e.g., "Ice Dancing Canadians Tessa
Virtue and Scott Moir clinch the gold in Olympic ice dancing; U.S. pair Davis and White win silver;
2/22/2010").
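To make the overlap idea concrete, the following is a minimal sketch of query-based extractive selection, not the method of any of the cited systems. It assumes a simple word-overlap score (the fraction of query terms found in a sentence) and appends the subject line to the user query before scoring. The function names, the parameter k, and the sample sentences are all hypothetical and chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def query_overlap(sentence, query):
    """Fraction of query terms that also occur in the sentence (assumed score)."""
    query_terms = tokenize(query)
    if not query_terms:
        return 0.0
    return len(tokenize(sentence) & query_terms) / len(query_terms)


def query_based_summary(sentences, query, subject="", k=2):
    """Pick the k sentences that best overlap with subject line plus query.

    Ties are broken by position, and the selected sentences are returned
    in their original order.
    """
    focus = f"{subject} {query}"
    scored = [(query_overlap(s, focus), i) for i, s in enumerate(sentences)]
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]


# Hypothetical sentences standing in for an email thread.
sentences = [
    "The project meeting is moved to Friday.",
    "Lunch was great yesterday.",
    "Please send the budget report before the meeting.",
    "My cat is sleeping.",
]
summary = query_based_summary(
    sentences, query="budget report", subject="project meeting", k=2
)
print(summary)
```

With an empty query and subject every sentence scores zero, so this sketch falls back to the first k sentences, which behaves like a crude generic summary. A real system would likely replace raw word overlap with a weighted similarity measure and combine it with other informativeness features.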
For an example of a query-based abstractive summary of our synthetic email conversation, see
Figure 1.8.