[Figure 8.23: four panels (a)-(d), each with Radius on the horizontal axis and a logarithmic vertical axis (10^0 to 10^7 in panels (a), (b); 10^0 to 10^8 in panels (c), (d)). Legend snapshots run from 1976-06 to 1999-12 in (a), (b) and from 2003-05 to 2006-08 in (c), (d). The panels are annotated "Expansion" and "Contraction".]
FIGURE 8.23 Radius distribution over time for (a), (b) U.S. Patent and (c), (d) LinkedIn
graphs. “Expansion”: the radius distribution moves to the right until the gelling point.
“Contraction”: the radius distribution moves to the left after the gelling point.
8.7 CONCLUSION
In this chapter, we presented PEGASUS [29], a graph mining package for very large
graphs using the Hadoop architecture. Since the introduction of PEGASUS, many
other graph-processing systems have been proposed [3,34,36]. The PEGASUS source
code can be downloaded from http://www.cs.cmu.edu/~pegasus/. We presented
case studies in which PEGASUS yields insights from large-scale graphs,
including the important case of the web graph. Finding further applications
of these tools is left as a direction for interested practitioners.
We summarize the contributions of PEGASUS:
• We identified the common, underlying primitive of several graph mining
operations and showed that it is a generalized form of matrix-vector
multiplication. We called this operation generalized iterative matrix-vector
multiplication and showed that it includes diameter estimation, PageRank
estimation, RWR calculation, and finding connected components as special
cases.
• Given its importance, we proposed several optimizations (block
multiplication, diagonal block iteration, node renumbering, etc.) and reported the
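
The generalized matrix-vector primitive named in the first contribution can be illustrated with a small single-machine sketch. It is not the PEGASUS Hadoop implementation; it only assumes that one step of the primitive can be described by three user-supplied operations, here given the illustrative names combine2 (combine a matrix entry with a vector entry), combine_all (reduce the partial results for one row), and assign (merge the old and new vector entries). The helper names gim_v, iterate, connected_components, and pagerank, the edge-list representation, and the damping value 0.85 are all assumptions made for this example, and the PageRank variant assumes every node has at least one outgoing edge.

```python
from collections import Counter


def gim_v(edges, v, combine2, combine_all, assign):
    """One generalized matrix-vector step over a sparse matrix.

    edges holds (i, j, m_ij) triples; the new entry for row i is
    assign(v[i], combine_all(i, [combine2(m_ij, v[j]) for each j])).
    """
    partial = {i: [] for i in range(len(v))}
    for i, j, m in edges:
        partial[i].append(combine2(m, v[j]))
    return [assign(v[i], combine_all(i, partial[i])) for i in range(len(v))]


def iterate(edges, v, ops, max_iter=200, tol=1e-9):
    """Repeat the step until the vector stops changing or max_iter is reached."""
    for _ in range(max_iter):
        nxt = gim_v(edges, v, *ops)
        if max(abs(a - b) for a, b in zip(nxt, v)) <= tol:
            return nxt
        v = nxt
    return v


def connected_components(n, undirected_edges):
    """Label every node with the smallest node id in its component."""
    edges = [(a, b, 1.0) for a, b in undirected_edges]
    edges += [(b, a, 1.0) for a, b in undirected_edges]
    ops = (
        lambda m, vj: vj,                             # pass the neighbor's label
        lambda i, xs: min(xs, default=float("inf")),  # smallest neighbor label
        lambda vi, new: min(vi, new),                 # keep the smaller label
    )
    return iterate(edges, list(range(n)), ops)


def pagerank(n, directed_edges, c=0.85):
    """PageRank with damping c; assumes no node lacks outgoing edges."""
    out_deg = Counter(src for src, _ in directed_edges)
    # Row = destination, column = source, entry = 1 / out-degree of the source.
    edges = [(dst, src, 1.0 / out_deg[src]) for src, dst in directed_edges]
    ops = (
        lambda m, vj: c * m * vj,               # damped share from one in-neighbor
        lambda i, xs: (1.0 - c) / n + sum(xs),  # add the random-jump term
        lambda vi, new: new,                    # replace the old score
    )
    return iterate(edges, [1.0 / n] * n, ops)


if __name__ == "__main__":
    print(connected_components(5, [(0, 1), (1, 2), (3, 4)]))
    print(pagerank(3, [(0, 1), (0, 2), (1, 2), (2, 0)]))
```

Only the three operations differ between the two algorithms, while the iteration loop is shared. That shared loop is what makes it plausible to optimize the primitive once and have every algorithm built on it benefit.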