Databases Reference
In-Depth Information
A
B
C
D
Figure 5.3: C is now a dead end
Note that it is substochastic, but not stochastic, because the sum of the third
column, for C, is 0, not 1. Here is the sequence of vectors that result by starting
with the vector with each component 1/4, and repeatedly multiplying the vector
by M :
2
3
2
3
2
3
2
3
2
3
1/4
1/4
1/4
1/4
3/24
5/24
5/24
5/24
5/48
7/48
7/48
7/48
21/288
31/288
31/288
31/288
0
0
0
0
4
5
4
5
4
5
4
5
4
5
As we see, the probability of a surfer being anywhere goes to 0, as the number
of steps increase.
2
There are two approaches to dealing with dead ends.
1. We can drop the dead ends from the graph, and also drop their incoming
arcs. Doing so may create more dead ends, which also have to be dropped,
recursively. However, eventually we wind up with a strongly-connected
component, none of whose nodes are dead ends. In terms of Fig. 5.2,
recursive deletion of dead ends will remove parts of the out-component,
tendrils, and tubes, but leave the SCC and the in-component, as well as
parts of any small isolated components. 4
2. We can modify the process by which random surfers are assumed to move
about the Web. This method, which we refer to as “taxation,” also solves
the problem of spider traps, so we shall defer it to Section 5.1.5.
If we use the first approach, recursive deletion of dead ends, then we solve the
remaining graph G by whatever means are appropriate, including the taxation
method if there might be spider traps in G. Then, we restore the graph, but keep
4 You might suppose that the entire out-component and all the tendrils will be removed, but
remember that they can have within them smaller strongly connected components, including
spider traps, which cannot be deleted.
Search WWH ::




Custom Search