6.6 Pseudo-Feedback
In this section we explore an easy-to-implement and feasible way to take advantage of relevance feedback without requiring human users to manually select relevant results. This dependence on manual judgements is one of the major problems of relevance feedback-based approaches. For example, in our experiments human judges determined whether web-pages were relevant using an experimental set-up that forced them to judge every result as relevant or not, which is not feasible in actual search engine use.
A well-known technique within relevance feedback is pseudo-feedback, namely simply assuming that the top x documents returned are relevant. One can then use these documents as a corpus of relevant documents to expand the queries using language models in the same manner as described in Sect. 6.3. Although this assumption inevitably admits some non-relevant documents, pseudo-relevance feedback is in general a more feasible method, as no human intervention is required.
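As an illustration only (the book's actual system uses the language-model framework of Sect. 6.3; the whitespace tokenization, stop-list, and count-based term selection below are simplifying assumptions), the basic pseudo-feedback expansion step can be sketched as:

```python
from collections import Counter

def pseudo_feedback_expand(query, ranked_docs, x=10, m=5, stopwords=frozenset()):
    """Expand a query by assuming the top-x ranked documents are relevant.

    Builds a maximum-likelihood unigram model over the pseudo-relevant set
    and appends its m most probable new terms to the query. A sketch of the
    general technique, not the exact model used in the experiments.
    """
    counts = Counter()
    for doc in ranked_docs[:x]:  # pseudo-relevant set: the top-x results
        for term in doc.lower().split():
            if term not in stopwords:
                counts[term] += 1
    query_terms = query.lower().split()
    # Most frequent terms in the pseudo-relevant set not already in the query.
    expansion = [t for t, _ in counts.most_common() if t not in query_terms][:m]
    return query_terms + expansion
```

With actual relevance feedback the same expansion would run over human-judged documents instead of the top-x results; the rest of the pipeline is unchanged, which is why the two settings are directly comparable below.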
Using the same optimal parameters as discovered in Sect. 6.5.1.1, tf with m = 10^2 was again deployed, but this time using pseudo-feedback.
Can pseudo-feedback from hypertext Web search help improve the rankings of Semantic Web data? The answer is clearly positive. Employing all ten results as pseudo-relevance feedback and the same previously optimized parameters, the best pseudo-relevance feedback result had an average precision of 0.6240. This was considerably better than the baseline of just using pseudo-relevance feedback from the Semantic Web to itself, which only had an average precision of 0.5251 (p < 0.001 and ε = 0.05). However, as shown by Fig. 6.12, the results are still not nearly as good as using hypertext pages judged relevant by humans, which had an average precision of 0.8611 (p < 0.05), and also clearly above the 'best' baseline of 0.5043 (p < 0.05).
This is likely because, not surprisingly, the hypertext Web results contain many irrelevant text fragments that serve as noise, preventing the relevance feedback from boosting the results.
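All of the comparisons above and below are in terms of average precision. As a reminder of the metric, a minimal sketch for a single ranked result list, using the common convention of averaging precision at each rank where a relevant document was retrieved:

```python
def average_precision(relevance):
    """Average precision for one ranked result list.

    `relevance` is a list of 0/1 judgements in rank order. Precision is
    taken at each rank holding a relevant document and averaged; variants
    instead divide by the total number of relevant documents in the corpus.
    """
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0
```

For example, a ranking judged [1, 0, 1, 0] yields precisions 1/1 and 2/3 at the two relevant ranks, giving an average precision of about 0.833.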
Can pseudo-feedback from the Semantic Web improve hypertext search? The answer is yes, but barely. The best result for average precision is 0.4321 (p < 0.05), which is better than the baseline of just using pseudo-feedback from hypertext Web results to themselves, which has an average precision of 0.3945 (p < 0.05), and the baseline without feedback at all of 0.4284 (p < 0.05). However, the pseudo-feedback results perform significantly worse by a large margin when compared to using Semantic Web documents judged relevant by humans as relevance feedback, which had an average precision of 0.6549 (p < 0.05). These results can be explained because, given the usual ambiguous and short one- or two-word queries, the Semantic Web tends to return structured data spread out over multiple subjects even more so than the hypertext Web. Therefore, adding pseudo-relevance feedback increases the amount of noise in the language model as opposed to using actual relevance feedback, hurting performance while still keeping it above the baseline.