Database Reference
In-Depth Information
proof works, and we leave it as an exercise. Given that the last three values of the limiting
vector must be the same, it is easy to discover the limit of the above sequence. The first
row of M tells us that the probability of A must be 3/2 the other probabilities, so the limit
has the probability of A equal to 3/9, or 1/3, while the probability for the other three nodes
is 2/9.
Solving Linear Equations
If you look at the 4-node “Web” of Example 5.2 , you might think that the way to solve the equation v = M v is by
Gaussian elimination. Indeed, in that example, we argued what the limit would be essentially by doing so. However,
in realistic examples, where there are tens or hundreds of billions of nodes, Gaussian elimination is not feasible. The
reason is that Gaussian elimination takes time that is cubic in the number of equations. Thus, the only way to solve
equations on this scale is to iterate as we have suggested. Even that iteration is quadratic at each round, but we can
speed it up by taking advantage of the fact that the matrix M is very sparse; there are on average about ten links per
page, i.e., ten nonzero entries per column.
Moreover, there is another difference between PageRank calculation and solving linear equations. The equation v =
M v has an infinite number of solutions, since we can take any solution v , multiply its components by any fixed con-
stant c , and get another solution to the same equation. When we include the constraint that the sum of the components
is 1, as we have done, then we get a unique solution.
This difference in probability is not great. But in the real Web, with billions of nodes
of greatly varying importance, the true probability of being at a node like
www.amazon.com is orders of magnitude greater than the probability of typical
nodes.
5.1.3
Structure of the Web
It would be nice if the Web were strongly connected like Fig. 5.1 . However, it is not, in
practice. An early study of the Web found it to have the structure shown in Fig. 5.2 . There
was a large strongly connected component (SCC), but there were several other portions that
were almost as large.
Search WWH ::




Custom Search