6.6 Pseudo-Feedback
In this section we explore an easy-to-implement and feasible way to take advantage of relevance feedback without requiring human users to manually select relevant results. This dependence on manual judgements is one of the major problems of relevance feedback-based approaches. For example, in our experiments human judges determined whether web-pages were relevant using an experimental set-up that forced them to judge every result as relevant or not, which is not feasible in actual search engine use.
A well-known technique within relevance feedback is pseudo-feedback, namely simply assuming that the top x documents returned are relevant. One can then use these documents as a corpus of relevant documents to expand the queries using language models in the same manner as described in Sect. 6.3. Although this assumption inevitably admits some non-relevant documents, pseudo-relevance feedback is in general a more feasible method, as no human intervention is required.
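As an illustration only (the book's actual system uses the language-model framework of Sect. 6.3; the whitespace tokenization, stop-list, and count-based term selection below are simplifying assumptions), the basic pseudo-feedback expansion step can be sketched as:

```python
from collections import Counter

def pseudo_feedback_expand(query, ranked_docs, x=10, m=5, stopwords=frozenset()):
    """Expand a query by assuming the top-x ranked documents are relevant.

    Builds a maximum-likelihood unigram model over the pseudo-relevant set
    and appends its m most probable new terms to the query. A sketch of the
    general technique, not the exact model used in the experiments.
    """
    counts = Counter()
    for doc in ranked_docs[:x]:  # pseudo-relevant set: the top-x results
        for term in doc.lower().split():
            if term not in stopwords:
                counts[term] += 1
    query_terms = query.lower().split()
    # Most frequent terms in the pseudo-relevant set not already in the query.
    expansion = [t for t, _ in counts.most_common() if t not in query_terms][:m]
    return query_terms + expansion
```

With actual relevance feedback the same expansion would run over human-judged documents instead of the top-x results; the rest of the pipeline is unchanged, which is why the two settings are directly comparable below.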
Using the same optimal parameters as discovered in Sect. 6.5.1.1, tf with m = 10^2 was again deployed, but this time using pseudo-feedback.
Can pseudo-feedback from hypertext Web search help improve the rankings of Semantic Web data? The answer is clearly positive. Employing all ten results as pseudo-relevance feedback and the same previously optimized parameters, the best pseudo-relevance feedback result had an average precision of 0.6240. This was considerably better than the baseline of just using pseudo-relevance feedback from the Semantic Web to itself, which only had an average precision of 0.5251 (p < 0.001 and ε = 0.05). However, as shown by Fig. 6.12, the results are still not nearly as good as using hypertext pages judged relevant by humans, which had an average precision of 0.8611 (p < 0.05), and also clearly above the 'best' baseline of 0.5043 (p < 0.05).
This is likely because, not surprisingly, the hypertext Web results contain many irrelevant text fragments that serve as noise, preventing the relevance feedback from boosting the results.
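All of the comparisons above and below are in terms of average precision. As a reminder of the metric, a minimal sketch for a single ranked result list, using the common convention of averaging precision at each rank where a relevant document was retrieved:

```python
def average_precision(relevance):
    """Average precision for one ranked result list.

    `relevance` is a list of 0/1 judgements in rank order. Precision is
    taken at each rank holding a relevant document and averaged; variants
    instead divide by the total number of relevant documents in the corpus.
    """
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0
```

For example, a ranking judged [1, 0, 1, 0] yields precisions 1/1 and 2/3 at the two relevant ranks, giving an average precision of about 0.833.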
Can pseudo-feedback from the Semantic Web improve hypertext search? The answer is yes, but barely. The best result for average precision is 0.4321 (p < 0.05), which is better than the baseline of just using pseudo-feedback from hypertext Web results to themselves, which has an average precision of 0.3945 (p < 0.05), and the baseline without feedback at all of 0.4284 (p < 0.05). However, the pseudo-feedback results perform significantly worse by a large margin when compared to using Semantic Web documents judged relevant by humans as relevance feedback, which had an average precision of 0.6549 (p < 0.05). These results can be explained because, given the usual ambiguous and short one- or two-word queries, the Semantic Web tends to return structured data spread out over multiple subjects even more so than the hypertext Web. Therefore, adding pseudo-relevance feedback increases the amount of noise in the language model as opposed to using actual relevance feedback, hurting performance while still keeping it above the baseline.