the WWW graph is the computation of web pages' PageRank [63] for web searches.
Leaving aside engineering details such as the damping factor, the PageRank algorithm
iteratively computes the PageRank of each page from the PageRanks of the pages
that link to it. The algorithm terminates when the PageRanks of the pages converge.
The PageRank algorithm is often used as an example to illustrate the performance
characteristics of cloud-based platforms.
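The iteration described above can be sketched in a few lines of Python. This is a minimal single-machine illustration, not the formulation of [63]: the function name pagerank, the parameters tol and max_iter, the even redistribution of rank from pages without outgoing links, and the three-page example graph are all assumptions made for this sketch. The damping factor is omitted, as in the text, so convergence is only expected on graphs that are strongly connected and aperiodic.

```python
def pagerank(links, tol=1e-9, max_iter=1000):
    """Iterate undamped PageRank until the ranks stop changing.

    links maps each page to the list of pages it links to.
    tol and max_iter are illustrative stopping parameters (assumptions).
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(max_iter):
        new_rank = {p: 0.0 for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page passes its rank in equal shares to the pages it links to.
                share = rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Assumption: a page without out-links spreads its rank evenly.
                for q in pages:
                    new_rank[q] += rank[p] / n
        # Terminate when the total change between iterations is below tol.
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(example_links)
```

On this small graph the ranks settle near 0.4 for a and c and 0.2 for b. Each iteration reads only the previous ranks of a page's in-neighbors, which is why the computation is a common benchmark for distributed platforms.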
7.2.3 Information Networks
The Resource Description Framework (RDF) is an official W3C recommendation for
the semantic web. RDF triples naturally form a graph. Among other uses, RDF has
been applied to knowledge bases, such as DBpedia [6]. The ontology of DBpedia, derived
from Wikipedia, contains 3.7 million “things” and 400 million facts.* Such data
allow users to formulate complex queries about the information represented
in RDF. Applications of the semantic web continue to emerge each year [1].
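To illustrate how triples form a graph, the sketch below stores each (subject, predicate, object) triple as a labeled directed edge and answers a simple path query by following predicates. The triples, the predicate names capitalOf and memberOf, and the helpers build_graph and follow are hypothetical and chosen only for illustration; they are not taken from DBpedia, and a real deployment would use an RDF store and a query language such as SPARQL.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object), for illustration only.
triples = [
    ("Berlin", "capitalOf", "Germany"),
    ("Germany", "memberOf", "EU"),
    ("Paris", "capitalOf", "France"),
    ("France", "memberOf", "EU"),
]


def build_graph(triples):
    """Turn triples into an adjacency list: subject -> [(predicate, object)]."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def follow(graph, start, predicates):
    """Follow a sequence of predicates from start and return the nodes reached."""
    frontier = {start}
    for wanted in predicates:
        frontier = {
            obj
            for node in frontier
            for predicate, obj in graph.get(node, [])
            if predicate == wanted
        }
    return frontier


graph = build_graph(triples)
# Which organizations is the country with capital Berlin a member of?
result = follow(graph, "Berlin", ["capitalOf", "memberOf"])
```

Each subject and object becomes a node and each predicate an edge label, so a multi-step question reduces to a traversal over labeled edges.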
Search engine providers are actively introducing semantics into next-generation
search engines (e.g., Probase† [78]). According to a recent report, the graph-based
knowledge base Satori [13] from Microsoft, which enhances the search capabilities
of Bing, consists of more than 300 million nodes and 800 million edges. Google's
knowledge graph has 570 million objects and 18 billion facts about the relationships
between different objects. These knowledge graphs are expected to enhance the
ranking mechanisms of search results.
7.2.4 Miscellaneous
Other examples of large graphs include citation relationships among research articles,
relationships between US patents,‡ WordNet,§ communication networks, transportation
or road networks, and many others. Some of these graphs can be found in the
collection of graphs of the Stanford Network Analysis Project (SNAP) [53].
7.3 CLOUD-BASED GRAPH PROCESSING PLATFORMS
As described in the previous section, graph data are ubiquitous and their volume is
ever increasing. New computation- and data-intensive analysis tasks on graphs
are continuously being reported. The deployment of applications on such data has
been moving from a small number of high-performance servers or supercomputers
[31,46] toward a cloud with a large number of commodity servers [43,58].
A number of general-purpose development platforms, such as MapReduce [23], its
open-source variant Hadoop [33], and Dryad [37], have been proposed to help users
develop custom applications on the cloud without worrying about the complexity
beneath the cloud. For instance, data may be stored in distributed and replicated file
* DBpedia SPARQL Benchmark: http://aksw.org/Projects/DBPSB.html.
† Probase: http://research.microsoft.com/probase/.
‡ US Patent: http://vlado.fmf.uni-lj.si/pub/networks/data/patents/Patents.htm.
§ WordNet: http://vlado.fmf.uni-lj.si/pub/networks/data/dic/Wordnet/Wordnet.zip.