Database Reference
In-Depth Information
Think of the Web as a directed graph, where pages are the nodes, and there is an arc from
page p 1 to page p 2 if there are one or more links from p 1 to p 2 . Figure 5.1 is an example of
a tiny version of the Web, where there are only four pages. Page A has links to each of the
other three pages; page B has links to A and D only; page C has a link only to A , and page
D has links to B and C only.
Figure 5.1 A hypothetical example of the Web
Suppose a random surfer starts at page A in Fig. 5.1 . There are links to B , C , and D , so
this surfer will next be at each of those pages with probability 1/3, and has zero probability
of being at A . A random surfer at B has, at the next step, probability 1/2 of being at A , 1/2
of being at D , and 0 of being at B or C .
In general, we can define the transition matrix of the Web to describe what happens to
random surfers after one step. This matrix M has n rows and columns, if there are n pages.
The element m ij in row i and column j has value 1/ k if page j has k arcs out, and one of them
is to page i . Otherwise, m ij = 0.
EXAMPLE 5.1 The transition matrix for the Web of Fig. 5.1 is
In this matrix, the order of the pages is the natural one, A , B , C , and D . Thus, the first
column expresses the fact, already discussed, that a surfer at A has a 1/3 probability of next
being at each of the other pages. The second column expresses the fact that a surfer at B
has a 1/2 probability of being next at A and the same of being at D . The third column says a
surfer at C is certain to be at A next. The last column says a surfer at D has a 1/2 probability
of being next at B and the same at C .
The probability distribution for the location of a random surfer can be described by a
column vector whose j th component is the probability that the surfer is at page j . This prob-
ability is the (idealized) PageRank function.
Suppose we start a random surfer at any of the n pages of the Web with equal probability.
Then the initial vector v 0 will have 1/ n for each component. If M is the transition matrix
of the Web, then after one step, the distribution of the surfer will be M v 0 , after two steps it
Search WWH ::




Custom Search