Database Reference
In-Depth Information
its predecessors have PageRanks computed. These predecessors are A and D . In Fig. 5.4 , A
has three successors, so it contributes 1/3 of its PageRank to C . Page D has two successors
in Fig. 5.4 , so it contributes half its PageRank to C . Thus, the PageRank of C is
Now we can compute the PageRank for E . That node has only one predecessor, C , and
C has only one successor. Thus, the PageRank of E is the same as that of C . Note that
the sums of the PageRanks exceed 1, and they no longer represent the distribution of a
random surfer. Yet they do represent decent estimates of the relative importance of the
pages.
5.1.5
Spider Traps and Taxation
As we mentioned, a spider trap is a set of nodes with no dead ends but no arcs out.
These structures can appear intentionally or unintentionally on the Web, and they cause the
PageRank calculation to place all the PageRank within the spider traps.
EXAMPLE 5.5 Consider Fig. 5.6 , which is Fig. 5.1 with the arc out of C changed to point to
C itself. That change makes C a simple spider trap of one node. Note that in general spider
traps can have many nodes, and as we shall see in Section 5.4 , there are spider traps with
millions of nodes that spammers construct intentionally.
Figure 5.6 A graph with a one-node spider trap
The transition matrix for Fig. 5.6 is
If we perform the usual iteration to compute the PageRank of the nodes, we get
As predicted, all the PageRank is at C , since once there a random surfer can never
leave.
Search WWH ::




Custom Search