Table 1.5  Results for the subjectivity classification part. Isolated, the SentiWS features are most suitable for subjectivity classification with an SVM. Our approach outperforms the Naive Bayes baseline. The best-performing feature set achieves an F1-score of 0.72; it includes SentiWS terms, discourse markers, and valence shifters.

                   Neutral              Subjective           Macro-averaged       Accuracy
                   Pre    Rec    F1     Pre    Rec    F1     Pre    Rec    F1
NB baseline (all)  0.914  0.856  0.884  0.314  0.450  0.370  0.614  0.653  0.627  0.804
All                0.921  0.925  0.923  0.472  0.459  0.465  0.696  0.692  0.694  0.865
All-bow            0.918  0.923  0.921  0.457  0.440  0.449  0.688  0.682  0.685  0.861
All-postags        0.924  0.903  0.913  0.429  0.495  0.460  0.676  0.699  0.687  0.851
All-bow-postags    0.925  0.942  0.933  0.547  0.477  0.510  0.736  0.710  0.722  0.883
Bow                0.888  0.973  0.929  0.474  0.165  0.245  0.681  0.569  0.587  0.870
Discourse markers  0.881  0.725  0.795  0.150  0.330  0.206  0.515  0.528  0.501  0.675
Postags            0.906  0.867  0.886  0.298  0.385  0.336  0.602  0.626  0.611  0.805
SentiWS            0.918  0.939  0.929  0.511  0.431  0.468  0.715  0.685  0.698  0.874
Valence shifters   0.874  0.795  0.833  0.136  0.220  0.168  0.505  0.508  0.501  0.722
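The macro-averaged columns in Table 1.5 are the unweighted means of the two per-class scores, which can be verified against any row; a minimal check using the F1 values of the "All" row:

```python
# Macro-averaged F1 is the unweighted mean of the per-class F1 scores.
# The values below are taken from the "All" row of Table 1.5.
f1_neutral = 0.923
f1_subjective = 0.465

macro_f1 = (f1_neutral + f1_subjective) / 2
print(round(macro_f1, 3))  # 0.694, matching the table
```

Unlike micro-averaging, this weighting treats both classes equally, so the score is not dominated by the large neutral class.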
We determine the best hyperparameters γ and C for our SVM by employing a grid search. As our corpus is highly unbalanced, we set the penalty for the class subjective seven times larger than for the class neutral, and the penalty for the class positive two times larger than for the class negative.
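A grid search over γ and C with asymmetric class penalties, as described above, can be sketched as follows with scikit-learn. This is an illustrative sketch, not the authors' code: the synthetic data stands in for the sentiment feature matrix, and the parameter grid values are assumptions.

```python
# Sketch of a class-weighted SVM grid search over C and gamma.
# make_classification stands in for the (unavailable) sentiment features;
# the grid values are illustrative, not those used in the paper.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Roughly 87% majority class, mirroring the 742/109 neutral/subjective split.
X, y = make_classification(n_samples=200, weights=[0.87], random_state=0)

# Penalize errors on the minority class (subjective = 1) seven times
# more heavily than on the majority class (neutral = 0).
svm = SVC(kernel="rbf", class_weight={0: 1, 1: 7})

grid = GridSearchCV(
    svm,
    param_grid={"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]},
    scoring="f1_macro",
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_)
```

Scoring with `f1_macro` matches the macro-averaged evaluation used in Table 1.5, so the search optimizes the same quantity that is reported.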
Subjectivity Classification. For the assessment of our subjectivity classifier we make use of all 851 annotated quotations. We consider all quotations tagged as positive or negative to be subjective. By doing so we obtain a corpus of 109 subjective and 742 neutral quotations. The first experiments evaluate our sentiment features individually to examine their impact on the subjectivity classification task. Table 1.5 shows the results. In isolation, the SentiWS features achieve the best F1-score of 0.698. We determine the best-performing feature set, containing the SentiWS term, valence shifter, and discourse-marker-based features, by conducting a feature ablation study: in each experiment we leave out one feature type that does not improve, or even worsens, the classification result. Using the best-performing features we achieve a macro-averaged F1-score of 0.72, improving the F1-score by 0.095 over the Naive Bayes baseline and by 0.024 over the F1-score obtained with the SentiWS features alone. Regarding the classes neutral and subjective, retrieving neutral quotations works notably better than retrieving subjective quotations.
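The ablation procedure above can be sketched as a greedy loop that drops one feature group per round whenever the removal does not hurt the score. The `evaluate` function and its toy score table are hypothetical placeholders, seeded with two macro-F1 values from Table 1.5 for illustration.

```python
# Greedy feature ablation: repeatedly drop any feature group whose
# removal does not lower the evaluation score.
def ablate(groups, evaluate):
    best = evaluate(groups)
    changed = True
    while changed:
        changed = False
        for g in list(groups):
            score = evaluate([x for x in groups if x != g])
            if score >= best:  # removing g does not hurt: drop it
                best = score
                groups = [x for x in groups if x != g]
                changed = True
    return groups, best

# Toy score table (two entries taken from Table 1.5; all other
# combinations score worse). A real evaluate() would train and test
# the classifier on the given feature groups.
scores = {
    frozenset({"sentiws", "valence", "discourse"}): 0.722,
    frozenset({"sentiws", "valence", "discourse", "bow"}): 0.694,
}

def evaluate(groups):
    return scores.get(frozenset(groups), 0.65)

kept, f1 = ablate(["sentiws", "valence", "discourse", "bow"], evaluate)
print(kept, f1)
```

With these toy scores the loop discards the bag-of-words group and keeps the SentiWS, valence shifter, and discourse marker features, mirroring the outcome reported above.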
Polarity Classification. We evaluate our polarity classification approach on the 109 subjective quotations of the entire sentiment corpus. As with subjectivity classification, the most relevant features for polarity classification are the SentiWS features (Table 1.6). Including only SentiWS features, our approach already achieves an F1-score of 0.82. We are able to improve our results by adding POS tags and discourse markers as features. Together, the three feature types achieve a macro-averaged F1-score of 0.86. The score is 0.062 higher than the F1-score that results from including solely the SentiWS features.