Intelligent News Aggregator for German with Sentiment Analysis - Smart Information Systems: Computational Intelligence for Real-Life Applications - page 38


Table 1.7 Results for the overall sentiment classification

             Gold subj. answers +      Subj. classification +    Subj. classification +
             Pol. classification       Gold pol. answers         Pol. classification
             P      R      F1          P      R      F1          P      R      F1
Positive     0.849  0.873  0.861       1.0    0.521  0.685       0.642  0.479  0.548
Negative     0.75   0.711  0.730       1.0    0.105  0.191       0.077  0.053  0.063
Neutral      1.0    1.0    1.0         0.916  1.0    0.956       0.912  0.949  0.93
Macro-avg    0.866  0.861  0.864       0.972  0.542  0.611       0.543  0.493  0.514
Accuracy     0.977                     0.920                     0.870

Our approach achieves a macro-averaged F1-score of 0.51. While the polarity classifier performs reasonably well, the subjectivity classifier introduces a large error: many negative quotations are marked as neutral and are therefore never examined by the polarity classifier. Given correct subjectivity labels, the overall performance rises to an F1-score of 0.86.
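The macro-averaged F1-scores in Table 1.7 can be reproduced from the per-class F1 values. The sketch below assumes that the macro average is the unweighted mean of the three per-class F1-scores; the function and dictionary names are illustrative and not taken from the chapter.

```python
def macro_f1(per_class_f1):
    """Unweighted mean of per-class F1-scores (assumed definition)."""
    return sum(per_class_f1) / len(per_class_f1)


# Per-class F1 (positive, negative, neutral) copied from Table 1.7.
settings = {
    "gold subj. + pol. classification": [0.861, 0.730, 1.0],
    "subj. classification + gold pol.": [0.685, 0.191, 0.956],
    "subj. + pol. classification": [0.548, 0.063, 0.93],
}

macro_scores = {name: round(macro_f1(f1s), 3) for name, f1s in settings.items()}

for name, score in macro_scores.items():
    print(f"{name}: {score}")
```

Under this assumption the three settings give 0.864, 0.611 and 0.514, which agree with the Macro-avg row. The near-zero F1 for negative quotations in the fully automatic setting is what pulls the overall score down to 0.51.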

1.4.7 Conclusion

We solve the problem of sentiment classification of quotations in news articles by employing a two-stage approach in which we first separate subjective from neutral quotations and then categorize the subjective quotations as either positive or negative. Our approach performs best for both tasks with only a subset of the presented sentiment features. In both cases, SentiWS features contribute strongly to effective sentiment classification; leaving them out decreases the F1-score considerably. In contrast, leaving out simple bag-of-words features (uni- and bigrams) increases the classification quality, so we exclude them from the final feature sets. The relatively low overall F1-score of 0.51 mainly results from the output of the subjectivity classifier, which introduces a large error in the first step. It misses many subjective quotations that the polarity classifier would tag correctly. In particular, the majority of negative quotations are filtered out by the subjectivity classifier. Generally speaking, separating objective from subjective quotations is especially challenging in our scenario. It is easier to classify quotations as subjective if they are positive; if quotations are negative, the algorithm more often classifies them as neutral. The polarity classification quality for negative and positive quotations is comparable. Like Pang et al. [37], we find that incorporating position information into the feature vectors hardly influences sentiment classification effectiveness, so it can be excluded from the feature vectors.
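The two-stage design, and the way first-stage errors propagate, can be illustrated with a minimal sketch. The two keyword-based functions below are hypothetical stand-ins for the trained subjectivity and polarity classifiers; they are not the chapter's SentiWS-based models, and the toy word lists are assumptions made only for this example.

```python
from typing import Callable

Label = str  # "positive", "negative", or "neutral"


def two_stage_classify(
    quotation: str,
    is_subjective: Callable[[str], bool],
    polarity: Callable[[str], Label],
) -> Label:
    """Stage 1 filters out neutral quotations; stage 2 labels the rest."""
    if not is_subjective(quotation):
        # A stage-1 miss is final: the polarity classifier never sees it.
        return "neutral"
    return polarity(quotation)


# Hypothetical toy cue words, for illustration only.
POSITIVE_WORDS = {"gut", "hervorragend"}
NEGATIVE_WORDS = {"schlecht", "katastrophal"}


def toy_subjectivity(quotation: str) -> bool:
    # Deliberately weak: only positive cue words count as subjective.
    return bool(set(quotation.lower().split()) & POSITIVE_WORDS)


def toy_polarity(quotation: str) -> Label:
    tokens = set(quotation.lower().split())
    return "negative" if tokens & NEGATIVE_WORDS else "positive"


def gold_subjectivity(quotation: str) -> bool:
    # Stands in for correct subjectivity labels on subjective quotations.
    return True


quote = "Das Ergebnis ist katastrophal"
automatic = two_stage_classify(quote, toy_subjectivity, toy_polarity)
with_gold = two_stage_classify(quote, gold_subjectivity, toy_polarity)
```

Here the negative quotation is labelled neutral by the automatic pipeline because the first stage filters it out, while the same polarity function labels it negative once the subjectivity decision is correct. This mirrors the gap between the 0.51 and 0.86 F1-scores reported above.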

Inspired by Polanyi and Zaenen [40], we intend in future work to incorporate more contextual shifters and patterns for German in order to calculate contextual feature weights, instead of only encoding the presence and frequency of valence shifters. At the same time, we plan to consider discourse markers for feature weight calculation, following Mukherjee and Bhattacharyya [32]. In addition, appraisal groups may serve as supplementary information for the feature vectors [53]. Considering sentiment
