tazi evaluates her approach on a hand-annotated dataset of 500 short texts about
celebrities. Her unsupervised polarity classification method outperforms standard
supervised classifiers on the given dataset. The main contribution of Li et al. [ 27 ] is
an annotation scheme for labeling sentiments in German political news articles and
a dataset manually annotated according to the presented scheme. Each annotation
frame consists of the text anchor (labeled as idiom, phrase, word or compound noun),
the target, the source, and auxiliary words that may be intensifiers, diminishers, or
negations. In addition, the opinion frames may be marked with the attitude's polarity,
type (context-dependent or -independent), and intensity. With the aid of the relation
extraction tool DARE (footnote 19) and the annotated relation examples, rules are automatically
learned to extract the opinion source, target, and polarity. In their first experiments the
authors achieved promising results. Another corpus for German sentiment analysis on
news articles is contributed by Scholz et al. [ 46 ]. The corpus consists of around 1,500
statements labeled with the viewpoint (corresponding to our “opinion target”), either
CDU (footnote 20) or SPD (footnote 21), and the tonality of the statement (positive, neutral, negative). The
authors use parts of the dataset to generate sentiment dictionaries containing entries
scored with different measures. For media response analysis, the authors
evaluate news material and propose a supervised machine learning approach that
is similar to ours. Scholz et al. analyze the news in two stages. First, they detect
subjective statements and, second, they classify the subjective statements as either
positive or negative. In contrast to our findings, subjectivity detection appears to
perform better than polarity classification on the dataset of Scholz et al.
Detailed surveys on opinion mining and sentiment analysis can be found in
[ 28 , 36 , 48 ].
1.4.3 Sentiment Classification
We solve the problem of quotation sentiment classification with a supervised
two-stage approach. We first apply a subjectivity detection step in which we mark
quotations as either neutral or subjective. We then classify all subjective quotations
by polarity as either positive or negative. As a result, each processed quotation
is labeled as neutral, positive, or negative. For both tasks, subjectivity and polarity classification,
we train separate Support Vector Machine (SVM) classifiers [ 8 , 50 ] with a different
feature set and with different hyperparameters. We choose a radial basis kernel for
both SVMs and select the hyperparameters γ and C by performing tenfold cross-validation
on the dataset described in Sect. 1.4.5. We represent the quotations as
vectors of diverse features (Sect. 1.4.4). Among others we include the part-of-speech
tags and sentiment words as features that turned out to be essential for sentiment
19 http://dare.dfki.de/ .
20 Christlich Demokratische Union Deutschlands (Christian Democratic Union of Germany).
21 Sozialdemokratische Partei Deutschlands (Social Democratic Party of Germany).
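
The two-stage approach of Sect. 1.4.3 can be illustrated with a minimal sketch. It shows only the cascade logic: a subjectivity stage that filters out neutral quotations, followed by a polarity stage with its own feature set. The names Stage, classify_quotation, SUBJECTIVITY, and POLARITY, the toy English word lists, and the rule-based stand-in classifiers are assumptions made for illustration; in the approach described above, each stage would instead be a trained SVM with a radial basis kernel.

```python
from dataclasses import dataclass
from typing import Callable, List

# Toy lexicons (assumption, for illustration only; not from the chapter).
POSITIVE = {"good", "great"}
NEGATIVE = {"bad", "terrible"}


@dataclass
class Stage:
    """One stage of the cascade with its own feature set and classifier."""
    extract: Callable[[str], List[float]]   # stage-specific feature extraction
    predict: Callable[[List[float]], str]   # stand-in for a trained classifier


def sentiment_word_count(text: str) -> List[float]:
    """Feature for stage 1: number of sentiment-bearing words."""
    tokens = text.lower().split()
    return [float(sum(t in POSITIVE or t in NEGATIVE for t in tokens))]


def polarity_counts(text: str) -> List[float]:
    """Features for stage 2: counts of positive and negative words."""
    tokens = text.lower().split()
    return [
        float(sum(t in POSITIVE for t in tokens)),
        float(sum(t in NEGATIVE for t in tokens)),
    ]


# Hypothetical rule-based stand-ins for the two separately trained classifiers.
SUBJECTIVITY = Stage(
    extract=sentiment_word_count,
    predict=lambda x: "subjective" if x[0] > 0 else "neutral",
)
POLARITY = Stage(
    extract=polarity_counts,
    predict=lambda x: "positive" if x[0] >= x[1] else "negative",
)


def classify_quotation(text: str,
                       subjectivity: Stage = SUBJECTIVITY,
                       polarity: Stage = POLARITY) -> str:
    """Label a quotation as neutral, positive, or negative."""
    # Stage 1: neutral quotations are labeled here and never reach stage 2.
    if subjectivity.predict(subjectivity.extract(text)) == "neutral":
        return "neutral"
    # Stage 2: only subjective quotations receive a polarity label.
    return polarity.predict(polarity.extract(text))


if __name__ == "__main__":
    for quote in ("The meeting is on Monday",
                  "This reform is great",
                  "A terrible decision"):
        print(quote, "->", classify_quotation(quote))
```

Keeping the extractor and the classifier together in one Stage object mirrors the statement that each task uses a different feature set and different hyperparameters; swapping in trained models only requires replacing the two predict callables.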