Figure 5.6: A graph with a one-node spider trap
If we perform the usual iteration to compute the PageRank of the nodes, we
get
1/4    3/24     5/48    21/288          0
1/4    5/24     7/48    31/288   ...    0
1/4   11/24    29/48   205/288          1
1/4    5/24     7/48    31/288          0

(the four rows give, from top to bottom, the PageRank of A, B, C, and D)
As predicted, all the PageRank is at C, since once there a random surfer can
never leave.
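As a check on this computation, here is a minimal Python sketch of the power iteration, assuming the link structure of Figure 5.6 is A -> B, C, D; B -> A, D; C -> C (the spider trap); and D -> B, C. The matrix M and the node order are assumptions spelled out in the comments.

    from fractions import Fraction as F

    # Transition matrix for the graph of Figure 5.6 (assumed link structure:
    # A -> B, C, D;  B -> A, D;  C -> C, the spider trap;  D -> B, C).
    # Column j holds the out-link distribution of node j; node order is A, B, C, D.
    M = [
        [F(0),    F(1, 2), F(0), F(0)],
        [F(1, 3), F(0),    F(0), F(1, 2)],
        [F(1, 3), F(0),    F(1), F(1, 2)],
        [F(1, 3), F(1, 2), F(0), F(0)],
    ]

    v = [F(1, 4)] * 4  # start with the PageRank spread evenly over the 4 nodes

    for step in range(50):
        # one iteration: v := M v
        v = [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]
        if step < 3:
            print(step + 1, v)  # reproduces (3/24, 5/24, 11/24, 5/24), (5/48, ...), ...

    print("C's share after 50 steps:", float(v[2]))  # approaches 1

Running it reproduces the vectors shown above and shows C's component tending to 1 while the others tend to 0.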
To avoid the problem illustrated by Example 5.5, we modify the calculation
of PageRank by allowing each random surfer a small probability of teleporting
to a random page, rather than following an out-link from their current page.
The iterative step, where we compute a new vector estimate of PageRanks v′
from the current PageRank estimate v and the transition matrix M, is
v′ = βM v + (1−β)e/n
where β is a chosen constant, usually in the range 0.8 to 0.9, e is a vector of all
1's with the appropriate number of components, and n is the number of nodes
in the Web graph. The term βM v represents the case where, with probability
β, the random surfer decides to follow an out-link from their present page. The
term (1−β)e/n is a vector each of whose components has value (1−β)/n and
represents the introduction, with probability 1−β, of a new random surfer at
a random page.
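The modified iteration can be sketched the same way. Below is a minimal example, again using the assumed transition matrix for Figure 5.6; β = 0.8 is an arbitrary choice from the suggested 0.8 to 0.9 range.

    from fractions import Fraction as F

    BETA = F(4, 5)  # beta = 0.8, a value in the usual 0.8-0.9 range
    N = 4           # n, the number of nodes in the graph

    # Same (assumed) transition matrix as for Figure 5.6, node order A, B, C, D.
    M = [
        [F(0),    F(1, 2), F(0), F(0)],
        [F(1, 3), F(0),    F(0), F(1, 2)],
        [F(1, 3), F(0),    F(1), F(1, 2)],
        [F(1, 3), F(1, 2), F(0), F(0)],
    ]

    v = [F(1, N)] * N  # initial estimate: 1/n in every component

    for _ in range(100):
        # v' = beta * M v + (1 - beta) * e / n, computed componentwise
        v = [BETA * sum(M[i][j] * v[j] for j in range(N)) + (1 - BETA) / N
             for i in range(N)]

    print([float(x) for x in v])

With the teleportation term, C still ends up with the largest share of the PageRank, but no longer all of it.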
Note that if the graph has no dead ends, then the probability of introducing a
new random surfer is exactly equal to the probability that the random surfer will
decide not to follow a link from their current page. In this case, it is reasonable
to visualize the surfer as deciding either to follow a link or teleport to a random
page. However, if there are dead ends, then there is a third possibility, which