detection and polarity classification and train a separate SVM classifier for each
subtask. In order to find the most appropriate feature set for both tasks, we assess
a range of text classification features according to their applicability to sentiment
analysis. The results suggest that sentiment words are most suitable for both tasks
and that each classifier performs best with a slightly different feature set. While
part-of-speech information positively affects polarity classification, incorporating
valence shifters and discourse markers improves subjectivity detection. Overall,
subjectivity detection in news articles appears more challenging than polarity
classification. To evaluate our work, we have created two corpora. The first corpus
aims to support developing and assessing methods for quotation extraction. It contains
direct and indirect quotations annotated with the quotation speaker and a reporting verb
or clue, if available. The second corpus is based on our quotation corpus and provides
a sentiment label (positive, negative or neutral) for each quotation and an opinion
target, if it is explicitly mentioned in the quotation. Both corpora are freely available
for research purposes upon request.
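
The two-stage setup summarized above (subjectivity detection first, polarity classification second, each with its own feature set) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the word lists are hypothetical toy lexicons, part-of-speech features are omitted because they would need an external tagger, and the two classifiers are passed in as plain callables that stand in for the trained SVM models.

```python
import re
from typing import Callable, Dict, List

# Hypothetical toy lexicons; the actual resources are not given in the text.
POSITIVE_WORDS = {"good", "great", "success", "welcome"}
NEGATIVE_WORDS = {"bad", "failure", "crisis", "wrong"}
VALENCE_SHIFTERS = {"not", "never", "hardly", "very"}
DISCOURSE_MARKERS = {"but", "however", "although", "therefore"}

Features = Dict[str, int]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_in(tokens: List[str], lexicon: set) -> int:
    """Count how many tokens belong to the given lexicon."""
    return sum(1 for tok in tokens if tok in lexicon)


def subjectivity_features(text: str) -> Features:
    """Sentiment words plus valence shifters and discourse markers."""
    tokens = tokenize(text)
    return {
        "positive": count_in(tokens, POSITIVE_WORDS),
        "negative": count_in(tokens, NEGATIVE_WORDS),
        "shifters": count_in(tokens, VALENCE_SHIFTERS),
        "markers": count_in(tokens, DISCOURSE_MARKERS),
    }


def polarity_features(text: str) -> Features:
    """Sentiment words only; POS features are left out of this sketch."""
    tokens = tokenize(text)
    return {
        "positive": count_in(tokens, POSITIVE_WORDS),
        "negative": count_in(tokens, NEGATIVE_WORDS),
    }


def classify_quotation(
    text: str,
    is_subjective: Callable[[Features], bool],
    polarity_of: Callable[[Features], str],
) -> str:
    """Run the cascade: neutral if objective, otherwise positive or negative."""
    if not is_subjective(subjectivity_features(text)):
        return "neutral"
    return polarity_of(polarity_features(text))


# Stand-in rules used only to exercise the cascade; they are NOT trained SVMs.
def stub_subjectivity(f: Features) -> bool:
    return f["positive"] + f["negative"] > 0


def stub_polarity(f: Features) -> str:
    return "positive" if f["positive"] >= f["negative"] else "negative"
```

Keeping the two feature extractors separate mirrors the finding that each subtask benefits from a slightly different feature set; in a real system the stub functions would be replaced by the prediction methods of two separately trained classifiers.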
In future work, we plan to improve the recall of indirect quotations by automatically
detecting reporting verbs instead of using a predefined list. Concerning the
extraction of quotation speakers, we intend to incorporate a sophisticated approach
to co-reference resolution. Future work on our sentiment analysis approach will
include incorporating additional information during feature vector calculation to
represent the text more precisely. We plan to shift our work toward topic-related and
context-dependent opinion retrieval and to also admit text parts other than quotations
for sentiment analysis. To make use of our results, we plan to implement an
extended view on newspaper quotations. Users will be presented with a direct
comparison of quotations expressed by different speakers on a given topic or entity, and
with a timeline of opinions to facilitate monitoring developments and estimating trends.
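
The recall limitation of a predefined reporting-verb list, mentioned at the start of this paragraph, can be illustrated with a small sketch. The verb list, the sentence pattern ("speaker verb that content") and all names below are assumptions made for illustration; the text does not describe the actual extraction rules.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical predefined list; the list actually used is not given in the text.
REPORTING_VERBS = ("said", "stated", "announced", "claimed")


class IndirectQuotation(NamedTuple):
    speaker: str
    verb: str
    content: str


# Simplified pattern: "<speaker> <reporting verb> that <content>."
_PATTERN = re.compile(
    r"^(?P<speaker>.+?)\s+(?P<verb>"
    + "|".join(re.escape(v) for v in REPORTING_VERBS)
    + r")\s+that\s+(?P<content>.+?)\.?$",
    re.IGNORECASE,
)


def extract_indirect_quotation(sentence: str) -> Optional[IndirectQuotation]:
    """Return the quotation if a listed reporting verb introduces it, else None."""
    match = _PATTERN.match(sentence.strip())
    if match is None:
        return None
    return IndirectQuotation(
        speaker=match["speaker"],
        verb=match["verb"].lower(),
        content=match["content"],
    )
```

With this sketch, "The minister said that taxes will fall." is extracted, whereas "The minister hinted that taxes will fall." is missed because "hinted" is not in the list. Detecting reporting verbs automatically would aim to close exactly this kind of gap.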
Acknowledgments
We would like to thank Neofonie GmbH for providing news articles and the
infrastructure for our demonstrator and developing important components of the news aggregator.
In particular, the topic detection and tracking component and the component mining a dynamic
topic hierarchy were designed and implemented by Neofonie GmbH. We would also like to thank
Neofonie GmbH for regular communication, many helpful discussions, and valuable suggestions.
Our thanks also go to Sascha Narr, Kerstin Schütt, Michael Hülfenhaus, Jonas Katins, Xenofon
Chatziliadis and Leonhard Hennig. This work was funded by the Federal Ministry of Economic
Affairs and Energy (BMWi) under funding reference number KF2392309KM1.