Exercises for Section 5.3
EXERCISE 5.3.1 Compute the topic-sensitive PageRank for the graph of Fig. 5.15, assuming
the teleport set is:
(a) A only.
(b) A and C.
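As an aid for checking hand computations of this kind, the following is a minimal sketch of topic-sensitive PageRank by repeated iteration of v' = βMv + (1 − β)e_S/|S|, where S is the teleport set. The function name, the value β = 0.8, the iteration count, and the three-page graph are all assumptions made for illustration; in particular, the graph is hypothetical and is not the graph of Fig. 5.15, which must be substituted to answer the exercise.

```python
def topic_sensitive_pagerank(links, teleport_set, beta=0.8, iterations=100):
    """Iterate v' = beta * M v + (1 - beta) * e_S / |S|.

    links maps each page to the list of pages it links to. Every page is
    assumed to have at least one out-link (no dead ends).
    """
    pages = sorted(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    # Each page in the teleport set receives an equal part of the taxed rank.
    teleport_share = (1.0 - beta) / len(teleport_set)
    for _ in range(iterations):
        new_rank = {p: 0.0 for p in pages}
        for src, successors in links.items():
            # A page splits beta times its rank evenly among its successors.
            share = beta * rank[src] / len(successors)
            for dst in successors:
                new_rank[dst] += share
        for p in teleport_set:
            new_rank[p] += teleport_share
        rank = new_rank
    return rank


# Hypothetical three-page graph, NOT the graph of Fig. 5.15.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

rank_a = topic_sensitive_pagerank(example_links, {"A"})        # case (a)
rank_ac = topic_sensitive_pagerank(example_links, {"A", "C"})  # case (b)
for page in sorted(example_links):
    print(page, round(rank_a[page], 4), round(rank_ac[page], 4))
```

Because the sketch assumes no dead ends, the ranks sum to 1 after every iteration, which gives a quick sanity check on any graph supplied to it.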
5.4 Link Spam
When it became apparent that PageRank and other techniques used by Google made term
spam ineffective, spammers turned to methods designed to fool the PageRank algorithm
into overvaluing certain pages. The techniques for artificially increasing the PageRank of a
page are collectively called link spam. In this section we shall first examine how spammers
create link spam, and then see several methods for decreasing the effectiveness of these
spamming techniques, including TrustRank and measurement of spam mass.
Architecture of a Spam Farm
A collection of pages whose purpose is to increase the PageRank of a certain page or pages
is called a spam farm. Figure 5.16 shows the simplest form of spam farm. From the point
of view of the spammer, the Web is divided into three parts:
(1) Inaccessible pages: the pages that the spammer cannot affect. Most of the Web is in
this part.
(2) Accessible pages: the pages that, while they are not controlled by the spammer, can
be affected by the spammer.
(3) Own pages: the pages that the spammer owns and controls.
Figure 5.16 The Web from the point of view of the link spammer
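The three-way division above can be made concrete with a small experiment. The sketch below is illustrative only: the page names, the links, the value β = 0.8, and the shape of the farm (one target page t exchanging links with ten supporting pages) are assumptions, not details taken from Figure 5.16. It compares the PageRank of t when the spammer owns t alone with its PageRank when the spammer also owns the supporting pages.

```python
def pagerank(links, beta=0.8, iterations=100):
    """PageRank with taxation; every page is assumed to have an out-link."""
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives an equal share of the taxed rank.
        new_rank = {p: (1.0 - beta) / n for p in pages}
        for src, successors in links.items():
            share = beta * rank[src] / len(successors)
            for dst in successors:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical miniature Web; all names and links are made up.
inaccessible = {"i1": ["i2"], "i2": ["i3"], "i3": ["i1", "blog"]}
# Assumption: the spammer has managed to get a link to t onto this page.
accessible = {"blog": ["i1", "t"]}

# Baseline: the spammer's only own page is the target t.
without_farm = {**inaccessible, **accessible, "t": ["blog"]}

# Assumed farm: t links to ten supporting pages, each of which links back.
supporters = [f"s{k}" for k in range(10)]
own = {"t": supporters, **{s: ["t"] for s in supporters}}
with_farm = {**inaccessible, **accessible, **own}

base_rank = pagerank(without_farm)
farm_rank = pagerank(with_farm)
print("t without farm:", round(base_rank["t"], 4))
print("t with farm:   ", round(farm_rank["t"], 4))
```

In this toy setting the own pages keep most of the target's PageRank circulating inside the farm, so t ends up with a larger share than in the baseline even though the Web as a whole has grown.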