[Plot: log-log distribution for YahooWeb. x-axis: Size, 10^0 to 10^9; y-axis: 10^0 to 10^9. Annotations: "Spikes", "Giant connected component".]
FIGURE 8.12 Connected components of YahooWeb. Notice the two anomalous spikes that
lie far from the constant-slope line. Most of the components in these spikes are
domain-selling or porn sites replicated from templates.
8.6.1.2 Absorbed Connected Components and Dunbar's Number
The size of the giant component keeps growing, while the second- and third-largest
connected components do not grow beyond size 100 before they are absorbed by
the giant component. This is not surprising: if there were two giant components,
a new vertex would be likely to connect to both of them, merging the two.
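The absorption effect can be illustrated with a small union-find sketch. This is only an illustration, not the method used to produce the results above; the class name, the toy edge list, and the vertex numbering are all hypothetical.

```python
class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra          # attach the smaller tree to the larger
        self.size[ra] += self.size[rb]

    def component_sizes(self):
        """Return all component sizes, largest first."""
        roots = {self.find(v) for v in range(len(self.parent))}
        return sorted((self.size[r] for r in roots), reverse=True)


# Hypothetical toy graph: a large component {0..5} and a small one {6, 7}.
ds = DisjointSet(8)
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (6, 7)]:
    ds.union(u, v)
sizes_before = ds.component_sizes()

# A single new edge that touches both components merges them.
ds.union(5, 6)
sizes_after = ds.component_sizes()
```

In the toy graph, one edge between the two components is enough for the larger one to absorb the smaller one, which is why two large components are unlikely to coexist for long as a graph grows.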
8.6.1.3 “Anomalous” Connected Components
Figure 8.12 shows two spikes. In the first spike, at size 300, more than half of
the components have exactly the same structure. They were created by a
domain-selling company, and each component represents a domain for sale. The
spike arose because the company replicated sites from the same template
and injected the resulting disconnected components into the WWW network. In the
second spike, at size 1101, more than 80% of the components are porn sites
disconnected from the giant connected component. In general, by looking carefully
at the distribution plot of connected components, we were able to detect
interesting special-purpose communities that are disconnected from the rest of
the Internet.
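The distribution behind such a plot is the number of components of each size. A minimal single-machine sketch follows, assuming an undirected edge list that fits in memory; the function name and the toy graph are hypothetical and do not reflect how the YahooWeb graph was processed.

```python
from collections import Counter, deque


def component_sizes(num_vertices, edges):
    """Return the size of every connected component of an undirected graph."""
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    seen = [False] * num_vertices
    sizes = []
    for start in range(num_vertices):
        if seen[start]:
            continue
        # Breadth-first search from an unvisited vertex covers one component.
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in adj[node]:
                if not seen[nxt]:
                    seen[nxt] = True
                    queue.append(nxt)
        sizes.append(size)
    return sizes


# Hypothetical toy graph: one 5-vertex component, four replicated
# 3-vertex "template" components, and one isolated vertex.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
for base in (5, 8, 11, 14):
    edges += [(base, base + 1), (base + 1, base + 2)]
sizes = component_sizes(18, edges)

# Maps size -> number of components of that size (the plotted quantity).
distribution = Counter(sizes)
```

In this toy example the replicated template shows up as an unusually high count at size 3, the same kind of signal as the spikes at sizes 300 and 1101.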
8.6.2 PageRank Scores of Real-World Networks
We analyze the PageRank scores of real graphs using PEGASUS. Figures 8.13 and 8.14
show the distribution of the PageRank scores for the web graphs, and Figure 8.15 shows
the evolution of PageRank scores for the LinkedIn and Wikipedia graphs. We observe
power-law relations between the PageRank score and the number of vertices with that
score. The three sites with the highest PageRank for the year 2002 are
www.careerbank.com, access.adobe.com, and top100.rambler.ru. As expected, they
have huge in-degrees (from ≈70K to ≈70M).
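For readers unfamiliar with how such scores are computed, here is a minimal power-iteration sketch of PageRank. It is a single-machine illustration, not the PEGASUS implementation; the damping factor of 0.85, the iteration count, the function name, and the toy graph are all assumptions made for the example.

```python
from collections import Counter


def pagerank(num_vertices, edges, damping=0.85, iterations=50):
    """Approximate PageRank of a directed graph by power iteration."""
    out_links = [[] for _ in range(num_vertices)]
    for src, dst in edges:
        out_links[src].append(dst)

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iterations):
        # Rank held by vertices without out-links is spread uniformly.
        dangling = sum(rank[v] for v in range(num_vertices) if not out_links[v])
        base = (1.0 - damping) / num_vertices + damping * dangling / num_vertices
        new_rank = [base] * num_vertices
        for v in range(num_vertices):
            if out_links[v]:
                share = damping * rank[v] / len(out_links[v])
                for w in out_links[v]:
                    new_rank[w] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: vertices 1..4 all link to vertex 0, and 0 links to 1.
edges = [(1, 0), (2, 0), (3, 0), (4, 0), (0, 1)]
ranks = pagerank(5, edges)
top_vertex = max(range(5), key=lambda v: ranks[v])

# Score distribution: rounded score -> number of vertices with that score.
histogram = Counter(round(r, 2) for r in ranks)
```

In the toy graph, the vertex with the largest in-degree receives the highest score, in line with the observation above about the top-ranked sites. On a real graph the histogram would normally be built with logarithmic bins before looking for a power law.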