Databases Reference
In-Depth Information
is that the surfer goes nowhere. Since the term (1−β)e/n does not depend on
the sum of the components of the vector v, there will always be some fraction
of a surfer operating on the Web. That is, when there are dead ends, the sum
of the components of v may be less than 1, but it will never reach 0.
Example 5.6 : Let us see how the new approach to computing PageRank
fares on the graph of Fig. 5.6. We shall use β = 0.8 in this example. Thus, the
equation for the iteration becomes
2
3
2
3
0
2/5
0
0
1/20
1/20
1/20
1/20
4
5
4
5
4/15
0
0
2/5
v =
v +
4/15
0
4/5
2/5
4/15
2/5
0
0
Notice that we have incorporated the factor β into M by multiplying each of
its elements by 4/5. The components of the vector (1−β)e/n are each 1/20,
since 1−β = 1/5 and n = 4. Here are the first few iterations:
2
3
2
3
2
3
2
3
2
3
1/4
1/4
1/4
1/4
9/60
13/60
25/60
13/60
41/300
53/300
153/300
53/300
543/4500
707/4500
2543/4500
707/4500
15/148
19/148
95/148
19/148
4
5
4
5
4
5
4
5
4
5
By being a spider trap, C has managed to get more than half of the PageRank
for itself. However, the effect has been limited, and each of the nodes gets some
of the PageRank.
2
5.1.6 Using PageRank in a Search Engine
Having seen how to calculate the PageRank vector for the portion of the Web
that a search engine has crawled, we should examine how this information is
used. Each search engine has a secret formula that decides the order in which
to show pages to the user in response to a search query consisting of one or
more search terms (words). Google is said to use over 250 different properties
of pages, from which a linear order of pages is decided.
First, in order to be considered for the ranking at all, a page has to have at
least one of the search terms in the query. Normally, the weighting of properties
is such that unless all the search terms are present, a page has very little chance
of being in the top ten that are normally shown first to the user. Among the
qualified pages, a score is computed for each, and an important component of
this score is the PageRank of the page. Other components include the presence
or absence of search terms in prominent places, such as headers or the links to
the page itself.
5.1.7 Exercises for Section 5.1
Exercise 5.1.1 : Compute the PageRank of each page in Fig. 5.7, assuming
no taxation.
Search WWH ::




Custom Search