Fig. 1.4 The annotation tool used for the creation of the quotation extraction corpus
We asked the annotators to identify all quotations in a news article and advised
them to mark, for each quotation, the quoted text, the quotation holder, and, if
available, a reporting verb. A screenshot of the annotation tool is shown in Fig. 1.4.
For quotation holders referenced not by their proper name but, e.g., by a personal
pronoun or only by their last name, the annotators were asked to assign the full
proper name if possible. If a quotation or a reporting verb was composed of several
parts, the annotators were asked to mark all parts (*teilte der Sprecher mit*, "the
spokesman said"). They were also advised to indicate explicitly if a news article
contained no quotes at all.
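The annotation scheme described above could be modeled as follows. This is a minimal sketch under stated assumptions: the original tool's data model is not specified, so all field names are illustrative, and token spans are assumed to be (start, end) index pairs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # (start_token, end_token); assumed representation

@dataclass
class QuotationAnnotation:
    """Hypothetical record for one annotated quotation."""
    quote_spans: List[Span]                 # quoted text, possibly several parts
    holder_span: Span                       # surface mention of the quotation holder
    holder_name: Optional[str] = None       # resolved full proper name, if possible
    reporting_verb_spans: List[Span] = field(default_factory=list)  # may be empty

# An article without any quotes is annotated explicitly as an empty list.
article_annotations: List[QuotationAnnotation] = []
```

Representing spans as token index pairs makes it straightforward to compare annotations from different annotators token by token.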
In total, we annotated 714 news articles: 339 of them were annotated twice, 27
three times, and 2 even four times; the remaining 347 news articles were annotated
by a single annotator. The annotators agreed exactly on the quotations in 287 news
articles. We speak of exact agreement if the boundaries of the quoted text, the
quotation holder, and the reporting verb match exactly when comparing the annotated
tokens. The resulting corpus of 287 news articles contains 383 quotations, of which
256 are direct, 98 indirect, and 29 mixed (containing at least one direct and one
indirect part). A news article contains 1.3 quotations on average, and 87% of the
quotations are attributed with a reporting verb. A quotation holder was annotated
for every quotation; for 202 quotation holders we could resolve the reference and
assign a proper name.
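The exact-agreement criterion can be sketched as a comparison of token boundaries between two annotators' annotations of the same article. This is an illustrative reconstruction, not the authors' implementation; the dictionary keys and span representation are assumptions.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start_token, end_token); assumed representation

def exact_agreement(a: List[Dict], b: List[Dict]) -> bool:
    """Two annotators agree exactly on an article iff the token boundaries
    of the quoted text, the reporting verb, and the quotation holder match
    for every annotated quotation (order-independent comparison)."""
    def key(q: Dict):
        # Sort multi-part spans so discontinuous quotes/verbs compare reliably.
        return (tuple(sorted(q["quote"])),
                tuple(sorted(q["verb"])),
                q["holder"])
    return sorted(map(key, a)) == sorted(map(key, b))

# Toy annotations of one article by three annotators.
ann1 = [{"quote": [(5, 12)], "verb": [(3, 3)], "holder": (0, 1)}]
ann2 = [{"quote": [(5, 12)], "verb": [(3, 3)], "holder": (0, 1)}]
ann3 = [{"quote": [(5, 13)], "verb": [(3, 3)], "holder": (0, 1)}]

assert exact_agreement(ann1, ann2)      # identical boundaries
assert not exact_agreement(ann1, ann3)  # quote span differs by one token
```

Under this criterion, a single diverging token boundary in any of the three fields is enough to exclude an article from the 287-article agreement corpus.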
1.3.5 Evaluation
We evaluate our quotation extraction approach using a human-annotated corpus of
287 news articles where at least two annotators exactly agreed upon the contained