To avoid the problem illustrated by Example 5.5, we modify the calculation of PageRank by allowing each random surfer a small probability of teleporting to a random page, rather than following an out-link from their current page. The iterative step, where we compute a new vector estimate of PageRanks v′ from the current PageRank estimate v and the transition matrix M, is

v′ = βMv + (1 − β)e/n

where β is a chosen constant, usually in the range 0.8 to 0.9, e is a vector of all 1s with the appropriate number of components, and n is the number of nodes in the Web graph. The term βMv represents the case where, with probability β, the random surfer decides to follow an out-link from their present page. The term (1 − β)e/n is a vector each of whose components has value (1 − β)/n and represents the introduction, with probability 1 − β, of a new random surfer at a random page.
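In code, the taxed iteration is just a matrix-vector product plus a constant teleport term. Below is a minimal Python/NumPy sketch; the function name and the fixed iteration count are illustrative choices, and it assumes the book's usual convention that M is column-stochastic (entry M[i][j] is 1/k when page j has k out-links, one of which goes to page i):

```python
import numpy as np

def pagerank(M, beta=0.85, n_iter=50):
    """Iterate v' = beta*M*v + (1 - beta)*e/n, starting from the uniform vector."""
    n = M.shape[0]
    v = np.full(n, 1.0 / n)        # start from the uniform distribution
    teleport = (1.0 - beta) / n    # each component of (1 - beta)e/n
    for _ in range(n_iter):
        v = beta * (M @ v) + teleport
    return v
```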
Note that if the graph has no dead ends, then the probability of introducing a new random
surfer is exactly equal to the probability that the random surfer will decide not to follow a
link from their current page. In this case, it is reasonable to visualize the surfer as deciding
either to follow a link or teleport to a random page. However, if there are dead ends, then
there is a third possibility, which is that the surfer goes nowhere. Since the term (1 − β)e/n does not depend on the sum of the components of the vector v, there will always be some fraction of a surfer operating on the Web. That is, when there are dead ends, the sum of the components of v may be less than 1, but it will never reach 0.
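This claim is easy to check numerically: a dead end gives M an all-zero column, so the sum of v shrinks at each step, but the constant teleport term keeps it bounded away from 0. A tiny sketch, reusing the pagerank helper above on a hypothetical two-page graph in which page 1 has no out-links:

```python
# Page 0 links only to page 1; page 1 is a dead end, so its column is all 0s
# and M is substochastic rather than stochastic.
M_dead = np.array([[0.0, 0.0],
                   [1.0, 0.0]])

v = pagerank(M_dead, beta=0.8, n_iter=100)
print(v, v.sum())   # sum converges to 0.28: less than 1, but well above 0
```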
EXAMPLE 5.6 Let us see how the new approach to computing PageRank fares on the graph of Fig. 5.6. We shall use β = 0.8 in this example. Thus, the equation for the iteration becomes

$$v' = \begin{bmatrix} 0 & 2/5 & 0 & 0 \\ 4/15 & 0 & 0 & 2/5 \\ 4/15 & 0 & 4/5 & 2/5 \\ 4/15 & 2/5 & 0 & 0 \end{bmatrix} v \;+\; \begin{bmatrix} 1/20 \\ 1/20 \\ 1/20 \\ 1/20 \end{bmatrix}$$

Notice that we have incorporated the factor β into M by multiplying each of its elements by 4/5. The components of the vector (1 − β)e/n are each 1/20, since 1 − β = 1/5 and n = 4. Here are the first few iterations:

$$\begin{bmatrix} 1/4 \\ 1/4 \\ 1/4 \\ 1/4 \end{bmatrix},\quad \begin{bmatrix} 9/60 \\ 13/60 \\ 25/60 \\ 13/60 \end{bmatrix},\quad \begin{bmatrix} 41/300 \\ 53/300 \\ 153/300 \\ 53/300 \end{bmatrix},\quad \begin{bmatrix} 543/4500 \\ 707/4500 \\ 2543/4500 \\ 707/4500 \end{bmatrix},\quad \cdots,\quad \begin{bmatrix} 15/148 \\ 19/148 \\ 95/148 \\ 19/148 \end{bmatrix}$$
By being a spider trap, C has managed to get more than half of the PageRank for
itself. However, the effect has been limited, and each of the nodes gets some of the
PageRank.
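These numbers can be reproduced with the pagerank sketch above. A short check, assuming (as the columns of the example's matrix indicate) that in Fig. 5.6 node A links to B, C, and D; B to A and D; C only to itself; and D to B and C:

```python
# Transition matrix for the graph of Fig. 5.6, node order A, B, C, D.
# Column j holds 1/deg(j) in each row i that page j links to.
M = np.array([[0,   1/2, 0, 0  ],
              [1/3, 0,   0, 1/2],
              [1/3, 0,   1, 1/2],
              [1/3, 1/2, 0, 0  ]])

v = pagerank(M, beta=0.8, n_iter=100)
print(v)                                   # ~ [0.1014 0.1284 0.6419 0.1284]
print(np.array([15, 19, 95, 19]) / 148)    # the limiting vector from the text
```

C's component is about 0.64, more than half the total, matching the observation about the spider trap.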