5.1.6 Using PageRank in a Search Engine
Having seen how to calculate the PageRank vector for the portion of the Web that a search
engine has crawled, we should examine how this information is used. Each search engine
has a secret formula that decides the order in which to show pages to the user in response to
a search query consisting of one or more search terms (words). Google is said to use over
250 different properties of pages, from which a linear order of pages is decided.
First, in order to be considered for the ranking at all, a page has to have at least one of
the search terms in the query. Normally, the weighting of properties is such that unless all
the search terms are present, a page has very little chance of being in the top ten that are
normally shown first to the user. Among the qualified pages, a score is computed for each,
and an important component of this score is the PageRank of the page. Other components
include the presence or absence of search terms in prominent places, such as headers or the
text of links that point to the page.
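The two-stage process just described (qualify pages, then score them) can be sketched in a few lines. This is only an illustration under stated assumptions: real ranking formulas are secret, so the `Page` fields, the weights `w_pr`, `w_all`, `w_header`, `w_anchor`, and the linear form of the score are all hypothetical choices made for the example, not any search engine's actual method.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    # Hypothetical page record; field names are assumptions for this sketch.
    url: str
    pagerank: float
    body_terms: set = field(default_factory=set)
    header_terms: set = field(default_factory=set)
    anchor_terms: set = field(default_factory=set)  # words in links pointing to the page


def score(page, query, w_pr=1.0, w_all=2.0, w_header=0.5, w_anchor=0.5):
    """Return a score for a qualified page, or None if no query term is present.

    The weights are made-up values chosen only to illustrate the idea.
    """
    terms = set(query)
    present = terms & (page.body_terms | page.header_terms | page.anchor_terms)
    if not present:
        return None  # the page does not qualify for ranking at all
    s = w_pr * page.pagerank
    if present == terms:
        s += w_all  # large bonus when every search term is present
    s += w_header * len(terms & page.header_terms)  # terms in headers
    s += w_anchor * len(terms & page.anchor_terms)  # terms in incoming link text
    return s


def rank(pages, query, k=10):
    """Return the URLs of the top-k qualified pages, best score first."""
    scored = [(score(p, query), p) for p in pages]
    qualified = [(s, p) for s, p in scored if s is not None]
    qualified.sort(key=lambda sp: sp[0], reverse=True)
    return [p.url for _, p in qualified[:k]]
```

With weights like these, a page containing every search term usually outranks a page with higher PageRank that is missing a term, which mirrors the observation that pages lacking some search terms rarely reach the top ten.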
5.1.7 Exercises for Section 5.1
EXERCISE 5.1.1 Compute the PageRank of each page in Fig. 5.7, assuming no taxation.
Figure 5.7 An example graph for exercises
EXERCISE 5.1.2 Compute the PageRank of each page in Fig. 5.7, assuming β = 0.8.
! EXERCISE 5.1.3 Suppose the Web consists of a clique (set of nodes with all possible arcs
from one to another) of n nodes and a single additional node that is the successor of each
of the n nodes in the clique. Figure 5.8 shows this graph for the case n = 4. Determine the
PageRank of each page, as a function of n and β.
Figure 5.8 Example of graphs discussed in Exercise 5.1.3
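A closed-form answer to Exercise 5.1.3 can be checked numerically for particular values of n and β. The sketch below builds the graph described in the exercise and iterates the taxed update v' = βMv + (1 − β)e/N, where N = n + 1 is the total number of nodes. The function names `clique_plus_sink` and `pagerank` are hypothetical, and the sketch assumes that the extra node has no out-links and that nothing is done to redistribute the rank it absorbs, so the components of the result sum to less than 1.

```python
def clique_plus_sink(n):
    """Successor lists for the graph of Exercise 5.1.3.

    Nodes 0..n-1 form a clique; node n is the extra node that every
    clique node links to. The extra node is assumed to have no out-links.
    """
    succ = {i: [j for j in range(n + 1) if j != i] for i in range(n)}
    succ[n] = []
    return succ


def pagerank(succ, beta=0.8, iters=200):
    """Iterate v' = beta * M v + (1 - beta) * e / N on a successor-list graph.

    Dead ends simply leak their rank (an assumption of this sketch).
    """
    nodes = list(succ)
    total = len(nodes)
    v = {u: 1.0 / total for u in nodes}
    for _ in range(iters):
        # Every node first receives its share of the taxed rank.
        nxt = {u: (1.0 - beta) / total for u in nodes}
        # Each node then splits beta times its rank equally among its successors.
        for u in nodes:
            for w in succ[u]:
                nxt[w] += beta * v[u] / len(succ[u])
        v = nxt
    return v


if __name__ == "__main__":
    ranks = pagerank(clique_plus_sink(4), beta=0.8)
    for node, value in sorted(ranks.items()):
        print(node, round(value, 6))
```

Running this for several values of n and β gives numbers to compare against a formula derived by hand; by symmetry, all n clique nodes should receive the same value.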