Databases Reference
In-Depth Information
Solving Linear Equations
If you look at the 4-node “Web” of Example 5.2, you might think that the
way to solve the equation v = M v is by Gaussian elimination. Indeed,
in that example, we argued what the limit would be essentially by doing
so. However, in realistic examples, where there are tens or hundreds of
billions of nodes, Gaussian elimination is not feasible. The reason is that
Gaussian elimination takes time that is cubic in the number of equations.
Thus, the only way to solve equations on this scale is to iterate as we
have suggested. Even that iteration is quadratic at each round, but we
can speed it up by taking advantage of the fact that the matrix M is very
sparse; there are on average about ten links per page, i.e., ten nonzero
entries per column.
Moreover, there is another difference between PageRank calculation
and solving linear equations. The equation v = M v has an infinite number
of solutions, since we can take any solution v, multiply its components by
any fixed constant c, and get another solution to the same equation. When
we include the constraint that the sum of the components is 1, as we have
done, then we get a unique solution.
get by multiplying at each step by M is:
2
3
2
3
2
3
2
3
2
3
1/4
1/4
1/4
1/4
9/24
5/24
5/24
5/24
15/48
11/48
11/48
11/48
11/32
7/32
7/32
7/32
3/9
2/9
2/9
2/9
4
5
4
5
4
5
4
5
4
5
Notice that in this example, the probabilities for B, C, and D remain the
same. It is easy to see that B and C must always have the same values at any
iteration, because their rows in M are identical. To show that their values are
also the same as the value for D, an inductive proof works, and we leave it as
an exercise. Given that the last three values of the limiting vector must be the
same, it is easy to discover the limit of the above sequence. The first row of
M tells us that the probability of A must be 3/2 the other probabilities, so the
limit has the probability of A equal to 3/9, or 1/3, while the probability for the
other three nodes is 2/9.
This difference in probability is not great. But in the real Web, with billions
of nodes of greatly varying importance, the true probability of being at a node
like www.amazon.com is orders of magnitude greater than the probability of
typical nodes.
2
Search WWH ::




Custom Search