Digital Signal Processing Reference
In-Depth Information
Table 10.14 OOV statistics for Metacritic for no OOV resolution, stemming, and stemming with
on-line knowledge sources-based (OKS) resolution
# Words            Vocabulary   OOV words         OOV events
Baseline           83 328       25 217 (30.3 %)   38 244 (3.0 %)
Stemming           62 212       19 228 (30.9 %)   29 482 (2.3 %)
Stemming and OKS   62 212       14 123 (22.7 %)   22 746 (1.8 %)
Table 10.15 UA and WA for BoNG with SVM and on-line knowledge sources (OKS) for
two review classes (positive (+)/negative (−)) on Metacritic
[%]           BoNG    OKS
UA            77.73   60.44
WA            77.37   69.42
Recall +      77.07   77.21
Recall −      78.39   43.67
Precision +   92.18   81.92
Precision −   50.84   36.70
amount to 3.0 % of the total 1 288 384 words in the database (cf. Table 10.14). The
gap between the share of OOV words (30.3 % of the vocabulary) and the share of OOV
events (3.0 % of all words) is probably owing to proper nouns, such as film titles or
the names of actors, which each occur only rarely. Stemming affects the OOV words and
thereby the OOV events: after this step, the OOV words remain at a similar level of
30.9 %, whereas the OOV events decrease to 2.3 % of all words. As an additional
solution, OOV words can be substituted by non-OOV 'synonyms' with the help of
the on-line knowledge sources ConceptNet and WordNet; this lowers the OOV words to
22.7 % and the OOV events to 1.8 %.
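The distinction between OOV words (distinct vocabulary entries) and OOV events (running tokens) can be illustrated with a minimal sketch. The function name `oov_statistics`, the toy tokens, and the toy lexicon are assumptions made for illustration; the denominators (vocabulary size for OOV words, total token count for OOV events) follow the percentages in Table 10.14.

```python
from collections import Counter


def oov_statistics(tokens, lexicon):
    """Count OOV words (types) and OOV events (tokens) against a lexicon.

    `lexicon` is a hypothetical set of known words; how the reference
    vocabulary is actually built is not specified in the text.
    """
    counts = Counter(tokens)
    oov_words = [w for w in counts if w not in lexicon]
    oov_events = sum(counts[w] for w in oov_words)
    return {
        "vocabulary": len(counts),
        "oov_words": len(oov_words),
        "oov_word_rate": len(oov_words) / len(counts),
        "oov_events": oov_events,
        "oov_event_rate": oov_events / len(tokens),
    }


# Toy review text: one rare proper noun among frequent known words.
tokens = "the film stars keanu the film".split()
lexicon = {"the", "film", "stars"}
stats = oov_statistics(tokens, lexicon)
print(stats)
```

In this toy case a single rare proper noun makes up a quarter of the vocabulary but only one sixth of the running words, which mirrors the gap between the two columns of the table.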
Next, let us compare the BoNG and SVM approach with the domain-independent approach
based on on-line knowledge sources (cf. Sect. 6.3.4.4) on the same test data. The optimal
configuration as determined so far is used. In the case of BoNG, g_min = 1 and
g_max = 3, features are normalised TFIDF, and OOV resolution is applied.
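To make the BoNG configuration concrete, the following is a minimal pure-Python sketch of n-gram extraction for g_min = 1 and g_max = 3 with TFIDF weighting. The IDF variant (log of document count over document frequency) and the Euclidean length normalisation are assumptions, since the text only says "normalised TFIDF"; the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, g_min=1, g_max=3):
    """Return all word n-grams with g_min <= n <= g_max as strings."""
    out = []
    for n in range(g_min, g_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def tfidf_vectors(documents, g_min=1, g_max=3):
    """Map each tokenised document to a sparse, length-normalised vector.

    Assumed weighting: tf * log(N / df), then division by the Euclidean norm.
    """
    bags = [Counter(ngrams(doc, g_min, g_max)) for doc in documents]
    n_docs = len(bags)
    df = Counter()
    for bag in bags:
        df.update(bag.keys())  # each n-gram counted once per document
    vectors = []
    for bag in bags:
        weights = {g: tf * math.log(n_docs / df[g]) for g, tf in bag.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({g: (w / norm if norm else 0.0) for g, w in weights.items()})
    return vectors


print(ngrams(["not", "very", "good"]))
print(tfidf_vectors([["not", "good"], ["very", "good"]])[0])
```

Note that the bigram "not good" becomes a feature of its own, while the unigram "good", which occurs in both toy documents, receives zero weight under this IDF variant.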
The results are found in Table 10.15. They show a clear advantage for BoNG with
SVM: the gap of 7.95 % absolute in WA is owed to a 34.72 % absolute difference in the
recall of negative reviews. An explanation might be the inability of the proposed domain-
independent model to cope with negation, assuming negation to be more frequent
in negative reviews. In fact, this is a common non-trivial problem for syntax-driven
approaches [120, 133]. BoNG features, on the other hand, model negation, provided
that it occurs close to the word to be negated.
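For reference, the measures in Table 10.15 can be computed from a confusion matrix as sketched below: UA is taken as the unweighted mean of the class-wise recalls, and WA as the overall share of correctly classified instances. The function names and the toy counts are hypothetical; only the recalls passed to `unweighted_accuracy` at the end are taken from the table.

```python
def class_metrics(confusion):
    """Compute recall, precision, UA, and WA.

    `confusion[true_label][predicted_label]` holds instance counts.
    """
    labels = list(confusion)
    recall, precision = {}, {}
    for c in labels:
        true_total = sum(confusion[c].values())
        pred_total = sum(confusion[t][c] for t in labels)
        recall[c] = confusion[c][c] / true_total
        precision[c] = confusion[c][c] / pred_total
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[c][c] for c in labels)
    return {
        "recall": recall,
        "precision": precision,
        "UA": sum(recall.values()) / len(labels),  # mean of class recalls
        "WA": correct / total,                     # overall accuracy
    }


def unweighted_accuracy(recalls):
    """UA as the plain mean of per-class recalls."""
    return sum(recalls) / len(recalls)


# Hypothetical counts, with the positive class in the majority.
toy = {"+": {"+": 9, "-": 1}, "-": {"+": 3, "-": 2}}
print(class_metrics(toy))

# Recalls from Table 10.15 reproduce the reported UA values.
print(round(unweighted_accuracy([77.07, 78.39]), 2))  # BoNG
print(round(unweighted_accuracy([77.21, 43.67]), 2))  # OKS
```

Because UA ignores the class distribution, a weak recall on the minority class lowers UA far more than WA, which is the pattern visible in the OKS column.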
We will now turn to three classes of sentiment, which is a harder task, as mixed
or neutral reviews are particularly difficult to classify [134, 135]. For the syntax-driven
approach, the decision function needs to be extended to handle ternary recognition
tasks. This is achieved by a split into two binary tasks: negative plus mixed versus
positive, and negative versus mixed plus positive. For an optimal configuration, these
are 'tuned' in isolation, and two decision thresholds τ− and τ+ are obtained. These
thresholds form the basis of an overall valence decision function, where y is the
output class label, and S is the score of the sequence of words (cf. Sect. 6.3.3.4):
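The decision function itself is not preserved in this excerpt. The following is only a sketch of how two thresholds could turn a scalar score S into a three-way label y. It assumes that higher scores are more positive, that τ− ≤ τ+, and that scores lying between the thresholds (boundaries included) are labelled mixed; none of these details are confirmed by the text, and the name `valence` is hypothetical.

```python
def valence(score, tau_minus, tau_plus):
    """Map a sequence score S to a ternary label y (assumed scheme).

    Assumptions: higher scores are more positive, tau_minus <= tau_plus,
    and the closed interval [tau_minus, tau_plus] is labelled 'mixed'.
    """
    if tau_minus > tau_plus:
        raise ValueError("tau_minus must not exceed tau_plus")
    if score < tau_minus:
        return "negative"
    if score > tau_plus:
        return "positive"
    return "mixed"


# Hypothetical thresholds and scores.
for s in (-0.8, 0.05, 0.6):
    print(s, valence(s, tau_minus=-0.2, tau_plus=0.3))
```

Tuning the two binary tasks in isolation fits this scheme, since each threshold separates exactly one of the outer classes from the remaining two.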
 