NoSQL data architecture patterns - Making Sense of NoSQL

Let's look at the variations in the terms used to describe different types of graphs.

As you use the web, you'll often see links on a page that take you to another page; each of these links can be represented as a graph or triple. The current web page is the first or source node, the link is the arc that “points to” the second page, and the destination page is the second node. In this example, the source node is identified by the URL of the source page and the destination node by the URL of the destination page. This linking pattern appears in many places on the web, from ordinary page links to wiki sites, where each source and destination node is a page URL. Figure 4.11 shows an example of a graph store in which a web page links to other web pages.
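The source-arc-destination structure described above can be sketched in a few lines of Python. This is only an illustration, not how any particular graph store works: the example.com URLs and the helper name `destinations` are made up for the sketch.

```python
# Each link is stored as a (source, arc, destination) triple.
# The URLs are hypothetical examples, not taken from the text.
links = [
    ("http://example.com/home", "points to", "http://example.com/about"),
    ("http://example.com/home", "points to", "http://example.com/contact"),
    ("http://example.com/about", "points to", "http://example.com/home"),
]


def destinations(triples, source):
    """Return the destination URL of every "points to" arc leaving `source`."""
    return [dst for src, arc, dst in triples
            if src == source and arc == "points to"]


home_links = destinations(links, "http://example.com/home")
print(home_links)
```

Because both ends of every arc are page URLs, following a link is just a lookup on the first element of each triple.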

The concept of using URLs to identify nodes is appealing because URLs are human readable and carry structure within the URL itself. The W3C generalized this approach, storing information about the links between pages as well as the links between objects, into a standard called the Resource Description Framework, more commonly known as RDF.

Figure 4.11 An example of using a graph store to represent a web page that contains links to two other web pages. The URL of the source web page is stored as a URL property and each link is a relationship that has a “points to” property. Each link is represented as another node with a property that contains the destination page's URL. (Diagram labels: Source web page, Destination web page.)

4.2.2 Linking external data with the RDF standard

In a general-purpose graph store, you can create your own method to determine whether two nodes reference the same point in a graph. Most graph stores assign internal IDs to nodes as they load them into RAM. The W3C has instead focused on using URL-like identifiers called uniform resource identifiers (URIs) to create an explicit identifier for each node. This standard is called the W3C Resource Description Framework (RDF).
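To make the contrast between internal IDs and explicit URI identifiers concrete, here is a minimal sketch of a toy in-memory store. The class `TinyGraphStore`, its method `load_node`, and the example.org URIs are all hypothetical and are not the API of any real graph store; the sketch only assumes that a node's URI is used as the key that decides whether two loads refer to the same node.

```python
class TinyGraphStore:
    """Toy store that assigns an internal integer ID to each node it loads."""

    def __init__(self):
        self._id_by_uri = {}  # explicit URI -> internal ID
        self._next_id = 0

    def load_node(self, uri):
        """Return the internal ID for `uri`, creating the node on first sight."""
        if uri not in self._id_by_uri:
            self._id_by_uri[uri] = self._next_id
            self._next_id += 1
        return self._id_by_uri[uri]


store = TinyGraphStore()
alice = store.load_node("http://example.org/people/alice")
bob = store.load_node("http://example.org/people/bob")
alice_again = store.load_node("http://example.org/people/alice")

# The same URI always resolves to the same node; different URIs do not.
print(alice == alice_again, alice == bob)
```

The internal IDs are only meaningful inside this one store, while the URIs stay meaningful outside it, which is the property that identifying nodes by URI relies on.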

RDF was specifically created to join together external datasets created by different organizations. Conceptually, you can load two external datasets into one graph store and then perform graph queries on this joined database. The trick is knowing when two nodes reference the same object. RDF uses directed graphs, where the relationship specifically points from a source node to a destination node. The terminology for the source, link, and destination may vary based on your situation, but in general the terms subject, predicate, and object are used, as shown in figure 4.12.
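The idea of joining two external datasets and querying across them can be sketched with plain Python tuples in subject-predicate-object order. Everything here is an assumption made for illustration: the two "organizations", the example.org URIs, and the helper names `objects` and `author_names`. The person's name is stored as a plain string for simplicity.

```python
BOOK = "http://example.org/book/1"
PERSON = "http://example.org/person/42"
AUTHOR = "http://example.org/terms/author"
NAME = "http://example.org/terms/name"

# Dataset from a hypothetical organization A: which person wrote the book.
dataset_a = {(BOOK, AUTHOR, PERSON)}
# Dataset from a hypothetical organization B: the name of that same person.
dataset_b = {(PERSON, NAME, "Jane Doe")}

# Loading both datasets into one store is a set union of their triples.
merged = dataset_a | dataset_b


def objects(triples, subject, predicate):
    """Return every object reachable from `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def author_names(triples, book):
    """Follow book -> author -> name, which may span both datasets."""
    names = []
    for person in objects(triples, book, AUTHOR):
        names.extend(objects(triples, person, NAME))
    return names


print(author_names(merged, BOOK))
```

The query only succeeds on the merged set because both datasets use the same URI for the person; with different identifiers the two nodes would not be recognized as the same object.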

These terms come from formal logic systems and language. This terminology for describing how nodes are identified has been standardized by the W3C in its RDF standard. In RDF each node-arc-node relationship is called a triple and is associated

Figure 4.12 How RDF uses specific names for the general node-relationship-node structure. The source node is the subject, and the destination node is the object. The relationship that connects them is the predicate. The entire structure is called an assertion. (Diagram labels: Subject, Predicate, Object.)
