interface as a common Web user would see. Thus the authors decided not to use Google as a comparison.
The data from Vivisimo was also not collected because of Vivisimo's relatively small data sets.
Also collected in this set of experiments are the total number of relevant pages that each search
engine claims to have for a given query, and the processing time that the search engine takes to service
the query. The processing time is typically listed on each results page; that is, search engines process and return each page of results separately. Figure 1 illustrates this point, showing that there are a total of about 36,500,000 pages related to the query “thread”, and that it took Yahoo! 0.1 seconds to process the first page. Other search engines, including MSE, have similar features.
Results and Analysis
In this section, we present the results from the experiments and some observations about the results.
The first set of results reported here is the search quality. This is measured by the average number of relevant URLs among the first 20 returned URLs, the average rank, and the RankPower. Notice that the RankPower measure has a theoretical lower bound of 0.5; the closer to that value, the better the search quality. Table 3 shows the results from the five search engines we tested.
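As a sketch of how these figures can be computed, assuming RankPower is the average rank of the relevant URLs divided by their count, and the revised RankPower is the ideal rank sum divided by the observed rank sum (definitions inferred here; they are consistent with the values in Table 3, and the function names are our own):

```python
def rank_power(relevant_ranks):
    """RankPower: average rank of the relevant URLs divided by the
    number of relevant URLs.  Lower is better; the value approaches
    the theoretical lower bound 0.5 when every returned URL is
    relevant and the result list is long."""
    n = len(relevant_ranks)
    return (sum(relevant_ranks) / n) / n

def revised_rank_power(relevant_ranks):
    """Revised RankPower (assumed definition): the ideal rank sum
    1 + 2 + ... + n divided by the observed rank sum.  It lies in
    (0, 1]; higher is better."""
    n = len(relevant_ranks)
    return (n * (n + 1) / 2) / sum(relevant_ranks)

# If all of the first 10 returned URLs are relevant:
ranks = list(range(1, 11))
print(rank_power(ranks))          # 0.55, close to the 0.5 lower bound
print(revised_rank_power(ranks))  # 1.0, the best possible value
```

Under these assumed definitions, Google's row in Table 3 is reproduced: an average rank of 10.33 over 13.52 relevant URLs gives 10.33 / 13.52 ≈ 0.76.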
From the table one can tell that Google has the most favored RankPower measure because it contains
the highest average number of relevant URLs (13.52) in the results AND these relevant URLs are placed
relatively high on the returned list (average 10.33). On the other hand, MSE does not seem to fare well by the RankPower measure. However, Microsoft's new search engine seemed to have included
a very diverse array of results for the queries that we sent to it, while Google's results seemed to be
more focused. For example, when the “basketball” query was given to Microsoft, the results included
scouting/recruitment and high school basketball. Google focused on the more popular NBA and col-
legiate levels of basketball. This seems fairly self-evident: Google became the search leader because of
its high rate of return for more popular results based on its PageRank algorithm (Brin & Page, 1998).
MSE seems to return more diverse results with “high novelty”. This observation is supported by the results from a number of queries. If the raw number of relevant URLs does not convey the significance intuitively, the percentage of relevant pages among the total number of returned pages gives us more information. The average ranks from the different search engines do not differ greatly, ranging from
10.32 to 10.56. Thus a measurement of their “deviation” becomes important. The RankPower measure
captures some sense of the deviation of a set of values. The RankPower value of Google for example
Table 3. Average number of relevant URLs, average rank, and RankPower for the 27 queries, measured from the first 20 returned results

Search Engine   Avg. No. Relevant URLs   Pcnt. of Relevant URLs   Avg. Rank   RankPower   Revised RankPower
AlltheWeb       13.33                    67%                      10.56       0.79        0.68
Google          13.52                    68%                      10.33       0.76        0.70
MSE             10.81                    54%                      10.32       0.95        0.57
Vivisimo        13.15                    66%                      10.32       0.78        0.69
Yahoo           12.19                    61%                      10.39       0.85        0.63
 