The Semantics of Search - Social Semantics: The Search for Meaning on the Web

Fig. 6.3 The interface used to judge web-page results for relevancy

saved at 469 × 631 pixel resolution. The web-page was rendered,

rather than a link being given directly to the URI, because of the unstable state of the

Web, especially the hypertext Web. Even caching the HTML would have risked

losing much of the graphic element of the hypertext Web. By creating 'snapshot'

renderings, each judge at any given time was guaranteed to be presented with the

result in the same visual form. One side-effect of this is that web-pages that heavily

depend on non-standardized technologies or plug-ins would not render and were

thus presented as blank screen shots to the user, but this formed a small minority of

the data. The user-interface divided the evaluation into two steps:

• Judging relevant results from a hypertext Web search: The judge was given the

search terms created by an actual human user for a query and an example relevant

web-page whose full snapshot could be viewed by clicking on it. A full rendering

of the retrieved web-page was presented to the user with its title and summary

(as produced by Yahoo! Search) easily viewed by the judge, as in Fig. 6.3. The

judge clicked on the check-box if the result was considered relevant. Otherwise,

the web-page was by default recorded as not relevant. The web-page results were

presented to the judge one at a time, ten times for each query.

• Judging relevant results from a Semantic Web search: Next, the judge assessed

all the Semantic Web results for relevancy. These results were retrieved from the

Semantic Web using the same interface displayed to the judge in the first step as

shown in Fig. 6.4, and a title was displayed by retrieving any literal values from

rdfs:label properties and a summary by retrieving any literal values from

rdfs:comment values. Using the same interface as in the first step, the judge

had to determine whether or not the Semantic Web results were relevant.
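
The way a title and summary are derived for a Semantic Web result can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name, the tuple representation of triples, the example URIs, and the choice to join multiple literals with a space are all assumptions made here.

```python
# Full URIs of the two RDF Schema properties named in the text.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"


def title_and_summary(triples, subject):
    """Return (title, summary) strings for one Semantic Web result.

    `triples` is an iterable of (subject, predicate, object) string tuples.
    Joining several literals with a space is an assumption of this sketch.
    """
    labels, comments = [], []
    for s, p, o in triples:
        if s != subject:
            continue
        if p == RDFS_LABEL:
            labels.append(o)      # literal used for the displayed title
        elif p == RDFS_COMMENT:
            comments.append(o)    # literal used for the displayed summary
    return " ".join(labels), " ".join(comments)


# Hypothetical data, for illustration only.
EXAMPLE_SUBJECT = "http://example.org/EiffelTower"
EXAMPLE_TRIPLES = [
    (EXAMPLE_SUBJECT, RDFS_LABEL, "Eiffel Tower"),
    (EXAMPLE_SUBJECT, RDFS_COMMENT, "A wrought-iron tower in Paris."),
    ("http://example.org/Other", RDFS_LABEL, "Something else"),
]
```

Calling `title_and_summary(EXAMPLE_TRIPLES, EXAMPLE_SUBJECT)` yields the pair that would be shown to the judge in place of the title and summary supplied by the hypertext search engine. A result with neither property would yield two empty strings.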

After the ratings were completed, Fleiss' κ statistic was taken in order to

test the reliability of inter-judge agreement on relevancy judgments (Fleiss 1971).

Simple percentage agreement is not sufficient, as it does not take into account the
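
Fleiss' κ, as named above, compares the agreement observed among judges with the agreement expected if ratings were assigned by chance in proportion to how often each category was used overall. The sketch below computes it in plain Python from a table of counts. The function name, the input layout, and the use of three judges in the usage note are assumptions made for illustration; the source does not state them.

```python
def fleiss_kappa(counts):
    """Compute Fleiss' kappa.

    counts[i][j] is the number of judges who put item i in category j
    (for example j = 0 for "relevant", j = 1 for "not relevant").
    Every item must be rated by the same number of judges (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    # Per-item agreement: fraction of judge pairs that agree on the item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1.0:
        # All ratings fall in one category; kappa is undefined here.
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)
```

With three hypothetical judges, `fleiss_kappa([[3, 0], [0, 3]])` gives 1.0 (every judge agrees on every item), while `fleiss_kappa([[2, 1], [1, 2]])` gives a negative value, even though two of three judges agree on each item.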
