4. ESL Type 4 requests indicate that the user wants to examine one-tenth of all relevant documents; the metric is the number of irrelevant documents the user has to examine in order to achieve this goal. In this case, all relevant documents in the returned set of 200 have to be identified before the 10 percent can be counted. On average, AltaVista requires the user to examine about eight irrelevant documents before reaching the goal, while MARS requires fewer than one.
5. ESL Type 5 requests examine up to a certain number of relevant documents; the example quoted in Cooper's paper (Cooper, 1968) was five. For AltaVista, it takes about 26 irrelevant documents to find five relevant documents, while MARS requires only about 17. A sketch of how both request types can be computed follows this list.
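To make the two request types concrete, the following sketch counts the irrelevant documents a user must examine in a ranked result list before the goal is met. The function names and the boolean relevance-vector representation are our own illustration; for simplicity the sketch assumes a strict ranking, whereas Cooper's original ESL averages over documents tied at the same rank.

import math

def search_length(relevance, needed):
    # Count irrelevant documents examined before `needed` relevant
    # documents have been found. `relevance` is a list of booleans
    # (True = relevant) in rank order; returns None if the goal
    # cannot be met within the list.
    found = 0
    irrelevant_seen = 0
    for is_relevant in relevance:
        if found >= needed:
            break
        if is_relevant:
            found += 1
        else:
            irrelevant_seen += 1
    return irrelevant_seen if found >= needed else None

def esl_type4(relevance, fraction=0.1):
    # Type 4 goal: see a given fraction (here one-tenth) of all
    # relevant documents in the returned set.
    needed = math.ceil(fraction * sum(relevance))
    return search_length(relevance, needed)

def esl_type5(relevance, needed=5):
    # Type 5 goal: see a fixed number of relevant documents
    # (five in Cooper's example).
    return search_length(relevance, needed)

For a Type 4 request over a returned set of 200 documents, esl_type4 would be called with the full 200-element relevance vector; a Type 5 request simply calls esl_type5 with needed=5.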
Goals and Metrics of the Study
Since the early days of search engines in the early 1990s, relatively few performance studies of search engines have been available to the public. Researchers and engineers at Google published a few papers about their systems with some mention of performance (Ghemawat et al., 1999; Barroso et al., 2003). Most other performance comparisons come as news reports of users' perceptions, that is, how satisfied users feel with a particular search engine. The goal of this study is to assess the performance of MSE from a user's point of view with collected statistics. The study tries to answer the following questions. How long does it take for a search engine to respond to a user query? How many relevant results are there in total, from the search engine's point of view? Given that a typical user cannot examine all returned results, which typically number in the millions, how many of the top-20 results returned by a search engine are actually relevant to the query from a user's point of view? We also compare the performance of the search engines in these respects. The search engines involved in the study are Microsoft Search Engine (beta version) (MSE, 2005), AlltheWeb (ATW, 2008), Google (Google, 2008), Vivisimo (Vivisimo, 2008), and Yahoo! (Yahoo, 2008).
A number of performance metrics were measured in this study. The average response time is a measure of the duration between the time when a query is issued and the time when the response is received, as seen from the user's computer. Since a query typically retrieves hundreds or thousands of result pages, we measure separately the response time for the first page of URLs (typically 10 URLs) and for the following four pages, as sketched after this paragraph; the first page is measured separately because it takes much more time to generate than the rest of the pages. The second piece of statistics collected is the number of relevant URLs per query reported by the search engines. Although this is not necessarily a measure of how accurate the search results are, nor of how large a search engine's collection is, it is an interesting indication of the data set kept by a search engine. The third measurement is a user-perceived relevance measure for the queries. The authors sent 27 randomly chosen queries to MSE and the other peer search engines and manually examined the relevance of the first 20 results returned by each engine. The single-value measure RankPower (see the discussion in the previous section) is used to compare the performance of the selected search engines from an end-user's point of view.
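As a rough illustration of how the response-time statistics can be gathered, the sketch below times the first result page of a query separately from the following four pages. The base URL and the q and start parameter names are illustrative placeholders rather than the actual query interfaces of the engines in the study, and a real measurement run would repeat each query and average the timings.

import time
import urllib.parse
import urllib.request

def time_result_pages(base_url, query, pages=5, page_size=10):
    # Time each of the first `pages` result pages for one query.
    # The first page is reported separately from the rest, mirroring
    # the methodology described above.
    timings = []
    for page in range(pages):
        params = urllib.parse.urlencode({"q": query, "start": page * page_size})
        start = time.perf_counter()
        with urllib.request.urlopen(f"{base_url}?{params}") as response:
            response.read()  # include the transfer time of the whole result page
        timings.append(time.perf_counter() - start)
    return timings[0], timings[1:]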
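For the user-perceived relevance measure, a minimal sketch of the RankPower computation over the manually judged top-20 results is given below. It assumes the definition from the preceding section, namely the average rank of the relevant documents divided by their number, so that smaller values are better and the lower bound of (n + 1)/(2n) is reached when all n relevant documents occupy the top positions; the function name and example ranks are our own.

def rank_power(relevant_ranks, n_returned=20):
    # `relevant_ranks` holds the 1-based positions of the documents
    # judged relevant among the top `n_returned` results of one query.
    n = len(relevant_ranks)
    if n == 0:
        return float("inf")  # no relevant documents in the top N
    assert all(1 <= r <= n_returned for r in relevant_ranks)
    average_rank = sum(relevant_ranks) / n
    return average_rank / n

# Example: five relevant documents at positions 1, 2, 4, 7, and 15
# of the top 20 results yield a RankPower of (29 / 5) / 5 = 1.16.
print(rank_power([1, 2, 4, 7, 15]))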