The meetings were recorded at three European locations. The participants consist of both
native and non-native English speakers, and many of them are students.
The AMI corpus is freely available 2 and contains numerous annotations, such as the
summarization annotation described below, and multi-modal artefacts such as PowerPoint slides, notes,
and whiteboard events.
ICSI Corpus. The ICSI meeting corpus [Janin et al., 2003] is a corpus of 75 natural (i.e.,
non-scenario) meetings. As with the AMI non-scenario set, these are meetings that would have been
held anyway and feature a variable number of participants. Because many of the meetings in the
corpus are gatherings of ICSI researchers themselves, the topics tend to be specialized and technical,
e.g., discussions of speech and language technology. The average length of an ICSI meeting is
approximately one hour, which is longer than the average AMI non-scenario meeting (15-45 minutes).
Like the AMI corpus, the ICSI corpus meetings feature both native and non-native English
speakers. All meetings in the corpus were recorded at ICSI in Berkeley, California. Unlike the AMI
scenario meetings and similar to the AMI non-scenario meetings, there are varying numbers of
participants across meetings in the ICSI corpus, with an average of six but sometimes as many as
ten per meeting.
Unlike the AMI corpus, which is multi-modal and contains a variety of information such
as slides, whiteboard events and participant notes, the ICSI corpus consists entirely of speech and
relevant annotations. The ICSI corpus can be freely downloaded 3 and additional annotations of
the ICSI meetings are available via the AMI corpus download. Both corpora were annotated with
similar summarization annotation schemes as part of the AMI project, and we will describe those
annotations shortly. However, we first describe some basic concepts necessary to understand these
annotations.
Utterances, Dialogue Acts and Disfluencies. Usually when we talk about extractive summarization,
we are talking about extracting sentences from a document. However, with spoken conversations
such as meetings, people typically do not speak in complete, well-formed sentences. Their utterances
may be disfluent and ungrammatical. The utterances may be peppered with filled pauses such as uh
and um, indicating that the speaker is thinking. Utterances may overlap as speakers interrupt one
another, or a sentence may be abandoned if the speaker thinks the listener already understands. Filled
pauses, repetitions and fragments are all examples of disfluencies, phenomena which tend to make
speech less fluent and grammatical. Disfluencies pose a particular problem when transcribing
speech, as the resulting transcript can be difficult to read if the disfluencies are not corrected or
removed.
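To make the idea of removing disfluencies concrete, the following is a minimal Python sketch, not a method taken from the AMI or ICSI annotation schemes. It assumes a plain-text transcript and handles only two of the phenomena mentioned above: filled pauses and immediate word repetitions. The function name remove_simple_disfluencies and the FILLED_PAUSES list are hypothetical and chosen for illustration only.

```python
import re

# Assumed, illustrative list of filled pauses; a real system would use a
# larger, corpus-specific inventory.
FILLED_PAUSES = {"uh", "um"}


def remove_simple_disfluencies(utterance: str) -> str:
    """Drop filled pauses and collapse immediately repeated words.

    The utterance is lowercased and punctuation is discarded, so the
    output is a normalized word sequence rather than a faithful transcript.
    """
    tokens = re.findall(r"[a-z']+", utterance.lower())
    cleaned = []
    for token in tokens:
        if token in FILLED_PAUSES:
            continue  # skip filled pauses such as "uh" and "um"
        if cleaned and cleaned[-1] == token:
            continue  # skip an immediate repetition of the previous word
        cleaned.append(token)
    return " ".join(cleaned)


print(remove_simple_disfluencies("Um, I I think we should uh start"))
```

A rule this simple also deletes legitimate repetitions (for example "had had") and does nothing about abandoned fragments or overlapping speech, so it should be read only as an illustration of why disfluency handling affects transcript readability.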
As we saw in Section 1.4.1, one way utterances can be analyzed is by identifying dialogue
acts [Stolcke et al., 2000]. A dialogue act represents the illocutionary meaning of an utterance, or
2 http://corpus.amiproject.org/
3 http://www.idiap.ch/mmm/corpora/icsi