Let's look at the variations in the terms used to describe different types of graphs.
As you use the web, you'll often see links on a page that take you to another page; each of these links can be represented as a graph or triple. The current web page is the first, or source, node; the link is the arc that “points to” the second page; and the second, or destination, page is the second node. In this example, the first node is represented by the URL of the source page and the second node by the URL of the destination page. This linking pattern can be found in many places on the web, from page links to wiki sites, where each source and destination node is a page URL.
Figure 4.11 shows a graph store that represents a web page linking to other web pages.
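The source-arc-destination structure described above can be sketched in a few lines of Python. This is only an illustration, not how any particular graph store works internally; the `example.com` URLs and the `build_graph` helper are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical page URLs, used only for illustration.
SOURCE = "http://example.com/index.html"
links = [
    (SOURCE, "points to", "http://example.com/about.html"),
    (SOURCE, "points to", "http://example.com/contact.html"),
]

def build_graph(triples):
    """Index (source, arc, destination) triples by their source node."""
    graph = defaultdict(list)
    for source, arc, destination in triples:
        graph[source].append((arc, destination))
    return graph

graph = build_graph(links)

# Follow every "points to" arc that leaves the source page.
destinations = [dest for arc, dest in graph[SOURCE] if arc == "points to"]
print(destinations)
```

Each tuple is one link: the first element is the source node, the second is the arc, and the third is the destination node.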
The concept of using URLs to identify nodes is appealing since a URL is human readable and carries structure of its own. The W3C generalized this structure to store information about the links between pages, as well as the links between objects, in a standard called the Resource Description Framework, more commonly known as RDF.
[Figure 4.11 node labels: Source web page, Destination web page]
Figure 4.11 An example of using a graph store to represent a web page that contains links to two other web pages. The URL of the source web page is stored as a URL property and each link is a relationship that has a “points to” property. Each link is represented as another node with a property that contains the destination page's URL.
4.2.2 Linking external data with the RDF standard
In a general-purpose graph store, you can create your own method to determine whether two nodes reference the same point in a graph. Most graph stores will assign internal IDs to each node as they load these nodes into RAM. The W3C has focused on a process of using URL-like identifiers called uniform resource identifiers (URIs) to create explicit node identifiers for each node. This standard is called the W3C Resource Description Framework (RDF).
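To make the contrast between internal IDs and explicit URI identifiers concrete, here is a minimal sketch of a toy store. It assumes that a store hands out integer IDs as nodes are loaded and that the URI is what decides whether two loads refer to the same node. The `NodeStore` class and the `example.org` URIs are hypothetical and do not describe any specific product.

```python
class NodeStore:
    """Toy store that assigns one internal integer ID per distinct URI."""

    def __init__(self):
        self._id_by_uri = {}

    def load(self, uri):
        # Reuse the existing internal ID when this URI was loaded before;
        # otherwise hand out the next unused integer.
        if uri not in self._id_by_uri:
            self._id_by_uri[uri] = len(self._id_by_uri)
        return self._id_by_uri[uri]

store = NodeStore()
alice_first = store.load("http://example.org/person/alice")
bob = store.load("http://example.org/person/bob")
alice_again = store.load("http://example.org/person/alice")

# The same URI resolves to the same node, whatever the internal ID is.
print(alice_first == alice_again, alice_first == bob)
```

The internal ID is only meaningful inside this one store, while the URI can be shared between datasets produced by different organizations.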
RDF was specifically created to join together external datasets created by different organizations. Conceptually, you can load two external datasets into one graph store and then perform graph queries on this joined database. The trick is knowing when two nodes reference the same object. RDF uses directed graphs, where the relationship specifically points from a source node to a destination node. The terminology for the source, link, and destination may vary based on your situation, but in general the terms subject, predicate, and object are used, as shown in figure 4.12.
These terms come from formal logic systems and language. This terminology for describing how nodes are identified has been standardized by the W3C in their RDF standard. In RDF each node-arc-node relationship is called a triple and is associated
[Figure 4.12 labels: Subject, Predicate, Object]
Figure 4.12 How RDF uses specific names for the general node-relationship-node structure. The source node is the subject, and the destination node is the object. The relationship that connects them together is the predicate. The entire structure is called an assertion.
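As a rough illustration of the subject-predicate-object structure and of joining two external datasets, the sketch below stores triples in plain Python sets. It does not use an RDF library. The `Triple` type, the `match` helper, and every `example.org` URI are assumptions made up for the example, and a plain string stands in for a literal value.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str

# Hypothetical URIs, used only for illustration.
BOOK = "http://example.org/book/1"
ALICE = "http://example.org/person/alice"
AUTHOR = "http://example.org/terms/author"
NAME = "http://example.org/terms/name"

# Two datasets, imagined as coming from different organizations.
dataset_a = {Triple(BOOK, AUTHOR, ALICE)}
dataset_b = {Triple(ALICE, NAME, "Alice")}

# Joining is a set union: nodes with the same URI become the same node.
merged = dataset_a | dataset_b

def match(triples, subject=None, predicate=None, obj=None):
    """Return triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]

# Query across both datasets: find the name of the book's author.
author = match(merged, subject=BOOK, predicate=AUTHOR)[0].object
names = [t.object for t in match(merged, subject=author, predicate=NAME)]
print(names)
```

The query only succeeds because both datasets identify the author with the same URI, which is the point of using explicit identifiers.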