Digital Signal Processing Reference
Table 10.16 UA and WA for BoNG with SVM and on-line knowledge sources (OKS) for
three review classes (positive (+)/mixed (0)/negative (−)) on Metacritic

[%]            BoNG     OKS
UA             53.99    38.80
WA             53.71    49.38
Recall +       57.62    68.43
Recall 0       43.91    37.32
Recall −       60.43    10.66
Precision +    75.66    58.82
Precision 0    42.92    35.74
Precision −    34.70    28.71
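$$
y = \begin{cases}
+1 & \text{if } S > \tau^{+} \\
0 & \text{if } \tau^{-} \le S \le \tau^{+} \\
-1 & \text{if } S < \tau^{-}
\end{cases}
\tag{10.1}
$$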
With τ− and τ+, a range around y = 0 can be chosen for the mixed class. The
best constellation observed was reached with τ− = −1.9 and τ+ = 0.6. Table 10.16
shows the results for this approach and for BoNG with SVM for the three-class task
on the test set.
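
The thresholded decision of Eq. (10.1) can be illustrated by a minimal Python sketch. The function name `classify_score` is hypothetical, the default thresholds are the best constellation reported above, and assigning scores that lie exactly on a threshold to the mixed class is an assumption made here for illustration.

```python
def classify_score(score, tau_neg=-1.9, tau_pos=0.6):
    """Map a knowledge-based score S to a class label y as in Eq. (10.1).

    Returns +1 (positive), 0 (mixed) or -1 (negative).
    Boundary handling (S equal to a threshold -> mixed) is an assumption.
    """
    if score > tau_pos:
        return 1
    if score < tau_neg:
        return -1
    return 0


# Hypothetical scores, chosen only to exercise all three branches.
labels = [classify_score(s) for s in (-2.5, -1.9, 0.0, 0.6, 1.2)]
print(labels)
```

Widening the interval between the two thresholds assigns more reviews to the mixed class, which is the tuning knob referred to in the text.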
Just as one might expect, accuracies drop in comparison to handling two classes.
BoNG with SVM overall leads to better results and provides more balanced recall
values across the classes, from 43.91 % for mixed up to 60.43 % for negative reviews.
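
As a plausibility check, one can recompute the UA values of Table 10.16 from the class-wise recalls. The sketch below assumes that UA denotes the unweighted mean of the per-class recalls, which is consistent with the tabulated numbers; the function name `unweighted_accuracy` is hypothetical.

```python
def unweighted_accuracy(recalls):
    """Unweighted mean of class-wise recalls (assumed definition of UA)."""
    return sum(recalls) / len(recalls)


# Recall values in percent, taken from Table 10.16 (classes +, 0, -).
bong_recalls = [57.62, 43.91, 60.43]
oks_recalls = [68.43, 37.32, 10.66]

ua_bong = unweighted_accuracy(bong_recalls)
ua_oks = unweighted_accuracy(oks_recalls)
print(f"UA BoNG: {ua_bong:.2f} %, UA OKS: {ua_oks:.2f} %")
```

The low UA of the knowledge-based approach is thus largely explained by its weak recall for negative reviews.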
One can now aim at a synergistic fusion of the two methods. To this end, again
the optimal configuration as determined up to now is chosen for each approach.
An early integration on the feature level that preserves the knowledge up to the
final decision process is followed first. With the given correlation of the feature
streams, this is known to be beneficial [136, 137]. Thus, a super-vector is created by
including the scores of the knowledge-based approach in the BoNG feature vector prior
to SVM classification on this new vector. Table 10.17 shows the corresponding results.
The WA increases over BoNG 'stand-alone' (cf. Table 10.16) by only 0.13 % absolute, to 53.84 %.
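
The construction of the super-vector can be sketched as a simple concatenation. This is only an illustration under stated assumptions: the function name `build_super_vector` and the feature values are hypothetical, and the subsequent SVM training on the fused vectors is not shown.

```python
def build_super_vector(bong_features, knowledge_scores):
    """Early fusion: append knowledge-based score(s) to the BoNG features."""
    return list(bong_features) + list(knowledge_scores)


# Hypothetical BoNG feature vector (e.g. n-gram frequencies) and one score S.
bong = [0.0, 2.0, 1.0, 0.0]
score = [0.8]

super_vector = build_super_vector(bong, score)
print(super_vector)
```

Since the score contributes a single dimension next to a typically very large BoNG space, a small gain from this kind of fusion is not surprising.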
The opposite approach of late semantic fusion is a decision based on the
predictions per model [136, 137]. A tuning of 'whom to trust when' is thus possible,
i.e., it can be modelled which approach is most reliable for which class. The results
so far revealed a strength of the knowledge-based method for the recall of positive
reviews in the ternary task. This can be emphasised by corresponding weighting or
rules. SVMs are able to provide pseudo-probabilities P in the range of 0 to 1
per class based on the distance to the hyperplane and the chosen multi-class
discrimination strategy. By class, let us denote these pseudo-probabilities in the given
ternary case as P− (negative), P0 (mixed), and P+ (positive). Now, with the score S of the
knowledge-based approach (cf. Sect. 6.3.4.4), we can influence when to decide for
the positive class by setting suitable conditions. The SVM decision is adopted if
these conditions are not met. A number of such conditions were tested and are
summarised alongside the results in Table 10.17. For the knowledge-based score, S > 0
and S > 0.6 (the positive discrimination threshold τ+ decided for above) were
considered, and for the SVMs, several conditions on the pseudo-probabilities
P+, P−, and P0 (cf. Table 10.17).
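
The rule-based late fusion described above can be sketched as follows. The names `late_fusion` and `score_above_tau_pos` are hypothetical, the pseudo-probabilities are made-up values, and the single condition S > τ+ stands in for the set of conditions that were actually tested.

```python
def late_fusion(score, probs, condition):
    """Decide for the positive class if the condition holds, else follow the SVM.

    score:     knowledge-based score S
    probs:     SVM pseudo-probabilities per class, keyed by '+', '0', '-'
    condition: callable(score, probs) -> bool, the rule favouring 'positive'
    """
    if condition(score, probs):
        return "+"
    # Fall back to the SVM decision: class with the highest pseudo-probability.
    return max(probs, key=probs.get)


def score_above_tau_pos(score, probs, tau_pos=0.6):
    """Example rule: trust the knowledge-based score if S > tau+."""
    return score > tau_pos


# Hypothetical SVM output that favours the negative class.
probs = {"+": 0.30, "0": 0.25, "-": 0.45}

overridden = late_fusion(0.9, probs, score_above_tau_pos)
svm_only = late_fusion(0.2, probs, score_above_tau_pos)
print(overridden, svm_only)
```

Passing the rule as a callable keeps the fallback to the SVM decision in one place, so that further conditions on S and on the pseudo-probabilities can be compared without changing the fusion logic.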
 