Database Reference
In-Depth Information
Think of the Web as a directed graph, where pages are the nodes, and there is an arc from
a tiny version of the Web, where there are only four pages. Page
A
has links to each of the
other three pages; page
B
has links to
A
and
D
only; page
C
has a link only to
A
, and page
D
has links to
B
and
C
only.
Figure 5.1
A hypothetical example of the Web
Suppose a random surfer starts at page
A
in
Fig. 5.1
. There are links to
B
,
C
, and
D
, so
this surfer will next be at each of those pages with probability 1/3, and has zero probability
of being at
A
. A random surfer at
B
has, at the next step, probability 1/2 of being at
A
, 1/2
of being at
D
, and 0 of being at
B
or
C
.
In general, we can define the
transition matrix of the Web
to describe what happens to
random surfers after one step. This matrix
M
has
n
rows and columns, if there are
n
pages.
The element
m
ij
in row
i
and column
j
has value 1/
k
if page
j
has
k
arcs out, and one of them
is to page
i
. Otherwise,
m
ij
= 0.
EXAMPLE
5.1 The transition matrix for the Web of
Fig. 5.1
is
In this matrix, the order of the pages is the natural one,
A
,
B
,
C
, and
D
. Thus, the first
column expresses the fact, already discussed, that a surfer at
A
has a 1/3 probability of next
being at each of the other pages. The second column expresses the fact that a surfer at
B
has a 1/2 probability of being next at
A
and the same of being at
D
. The third column says a
surfer at
C
is certain to be at
A
next. The last column says a surfer at
D
has a 1/2 probability
of being next at
B
and the same at
C
.
□
The probability distribution for the location of a random surfer can be described by a
column vector whose
j
th component is the probability that the surfer is at page
j
. This prob-
ability is the (idealized)
PageRank
function.
Suppose we start a random surfer at any of the
n
pages of the Web with equal probability.
Then the initial vector
v
0
will have 1/
n
for each component. If
M
is the transition matrix
of the Web, then after one step, the distribution of the surfer will be
M
v
0
, after two steps it