Figure 1.8: Sample query-focused (abstractive) summary of our synthetic email conversation.
example, a human-authored abstract in a science journal will usually give a high-level overview of
the experiments and conclusions, but may highlight a key finding in some detail.
Domain-Specific vs. General-Purpose Summarization We have mentioned conversation types such
as emails, meetings, blogs and chats, and we refer to these as separate conversation modalities.
Modality here refers to a means or mode of communication, where a particular conversation modality may
be associated with both distinct communication technologies and distinct social conventions
and language characteristics. From a more general viewpoint, without reference to communication
or language, these can also be considered distinct domains, and we will use the two terms more or
less interchangeably here.
For many tasks there is a tension between developing solutions that are general and broadly
applicable, and implementing tools that work only in specific domains, but are highly effective.
Summarization is no exception in this respect. Researchers have worked both on domain-specific
systems (e.g., McKeown et al. [2002] for news, Zhou et al. [2004] for biographies) and on
general-purpose platforms [Radev et al., 2004]. A related distinction for summarizing text
conversations, which will be discussed in Chapter 4, is whether a summarization approach can be
applied only to a particular conversational modality (e.g., email), or whether it can work on any
text conversation, independently of its modality. While most of the summarizers described in
Chapter 4 are domain/modality specific, as they exploit features peculiar to those modalities
(e.g., the subject line for emails, user ratings for blog posts), we will also cover recent attempts
to design a multi-modal system [Murray and Carenini, 2008] that relies only on features common to
all multi-party interaction,
such as speaker dominance in the conversations, turn-taking, and lexical cohesion. This system can
not only summarize conversations in different modalities (e.g., meetings, emails, blogs), but can
also work on conversations spanning multiple modalities (e.g., a transcript of a meeting
that was followed up by an email conversation). A multi-modal approach presents two additional,
critical advantages. First, by only harnessing features shared by all the modalities, it can facilitate the
transfer of knowledge from one modality to another [Sandu et al., 2010], which in machine
learning is called domain adaptation [Daumé and Marcu, 2006]. Second, this general approach should
easily cover novel conversational modalities, which are constantly emerging from human creativity
and technological advances.
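
To make the idea of modality-independent features concrete, the following is a minimal sketch in Python. It is not the feature set of Murray and Carenini [2008]; the `Turn` type, the `generic_features` function, and the three feature definitions (share of words per speaker, normalized turn position, and word overlap with the previous turn) are illustrative assumptions chosen only because they can be computed from any sequence of speaker turns, whether the turns come from emails, meetings, blogs or chats.

```python
import re
from collections import Counter
from typing import Dict, List, NamedTuple


class Turn(NamedTuple):
    """One contribution to a conversation (hypothetical structure)."""
    speaker: str
    text: str


def tokenize(text: str) -> List[str]:
    # Lowercase word tokens; a deliberately simple assumption.
    return re.findall(r"[a-z0-9']+", text.lower())


def generic_features(turns: List[Turn]) -> List[Dict[str, float]]:
    """Compute illustrative features that need only speakers and text."""
    tokens = [tokenize(t.text) for t in turns]

    # Speaker dominance: fraction of all words contributed by each speaker.
    words_by_speaker: Counter = Counter()
    for turn, toks in zip(turns, tokens):
        words_by_speaker[turn.speaker] += len(toks)
    total_words = sum(words_by_speaker.values()) or 1

    features = []
    for i, (turn, toks) in enumerate(zip(turns, tokens)):
        prev = set(tokens[i - 1]) if i > 0 else set()
        cur = set(toks)
        union = prev | cur
        features.append({
            "speaker_dominance": words_by_speaker[turn.speaker] / total_words,
            # Position of the turn in the conversation, scaled to [0, 1].
            "position": i / max(len(turns) - 1, 1),
            # Lexical cohesion approximated as Jaccard overlap with the
            # previous turn (an assumption, not the book's definition).
            "cohesion_with_previous": len(prev & cur) / len(union) if union else 0.0,
        })
    return features


if __name__ == "__main__":
    conversation = [
        Turn("alice", "Can we move the meeting"),
        Turn("bob", "Yes move the meeting to Friday"),
        Turn("alice", "Friday works"),
    ]
    for turn, feats in zip(conversation, generic_features(conversation)):
        print(turn.speaker, feats)
```

Because nothing in this sketch refers to subject lines, ratings, or any other modality-specific field, the same function could in principle be applied to a meeting transcript, an email thread, or a conversation that spans both.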