Big Data Computing Applications - Guide to Cloud Computing for Business and Technology Managers

21.4.4 Graph Stores or Databases

The rise of social media platforms such as Facebook, LinkedIn, and Twitter
has accelerated the emergence of the most complex type of NoSQL database,
the graph database. The graph database is oriented toward modeling and
deploying data that is graph-structured by nature. For example, to represent
a person and their friends in a social network, we can either write code to
convert the social graph into key-value pairs in Dynamo or Cassandra or
simply convert it into a node-edge model in a graph database, where managing
the relationship representation is much simpler.
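
As a rough illustration of that difference, the sketch below stores the same small friendship network two ways: as key-value pairs whose values are serialized friend lists, and as an in-memory node-edge structure. All names here (`kv_add_friendship`, `kv_friends`, `SocialGraph`) are hypothetical and written only for this example; they are not the API of Dynamo, Cassandra, or any graph database.

```python
import json
from collections import defaultdict


def kv_add_friendship(store, a, b):
    """Key-value encoding: read, modify, and rewrite both keys by hand."""
    for src, dst in ((a, b), (b, a)):
        friends = json.loads(store.get(src, "[]"))
        if dst not in friends:
            friends.append(dst)
        store[src] = json.dumps(friends)  # value is an opaque string


def kv_friends(store, person):
    """Decode the serialized friend list stored under one key."""
    return set(json.loads(store.get(person, "[]")))


class SocialGraph:
    """Node-edge model: people are nodes, friendships are undirected edges."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_friendship(self, a, b):
        self.edges[a].add(b)
        self.edges[b].add(a)

    def friends(self, person):
        return set(self.edges.get(person, set()))

    def friends_of_friends(self, person):
        """Two-hop traversal, excluding the person and direct friends."""
        direct = self.friends(person)
        reached = set()
        for friend in direct:
            reached |= self.edges[friend]
        return reached - direct - {person}
```

In the key-value version the application code owns the encoding and must keep both directions of each relationship consistent; in the node-edge version a two-hop query such as `friends_of_friends` is a direct walk over edges.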

A graph database represents each object as a node and each relationship
as an edge. For example, a person is a node, a household is a node, and the
relationship between them is an edge. As with the classic ER model for an
RDBMS, we need to create an attribute model for a graph database. We can
start by taking the highest level in a hierarchy as a root node (similar to
an entity) and connecting each attribute as its subnode. To represent
different levels of the hierarchy, we can add a subcategory or subreference
and create another list of attributes at that level. This creates a natural
traversal model, like a tree traversal, which is similar to traversing a
graph. Depending on the cyclic property of the graph, we can have a balanced
or skewed model. Some of the most evolved graph databases include Neo4j,
InfiniteGraph, GraphDB, and AllegroGraph.
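
The attribute model described above can be sketched as a small tree: a root node, attributes attached as subnodes, and a subcategory node that carries its own attribute list. This is a minimal sketch, not how any particular graph database stores data. The `Node` class, the `traverse` function, and the sample household values are all invented for illustration.

```python
class Node:
    """A node with a label, an optional value, and child subnodes."""

    def __init__(self, label, value=None):
        self.label = label
        self.value = value
        self.children = []

    def add(self, child):
        """Attach a subnode and return it so further levels can be chained."""
        self.children.append(child)
        return child


def traverse(node, depth=0):
    """Depth-first, pre-order walk yielding (depth, label, value)."""
    yield depth, node.label, node.value
    for child in node.children:
        yield from traverse(child, depth + 1)


# Root node (similar to an entity) with attributes as subnodes.
household = Node("household")
household.add(Node("address", "12 Elm St"))      # made-up sample value

# A subcategory introduces the next level of the hierarchy.
members = household.add(Node("members"))
person = members.add(Node("person"))
person.add(Node("name", "Ann"))                  # made-up sample value
person.add(Node("age", 34))                      # made-up sample value

for depth, label, value in traverse(household):
    print("  " * depth + label + ("" if value is None else f" = {value}"))
```

Because every subnode is reached from its parent, walking the hierarchy is an ordinary tree traversal; a general graph traversal would also have to track visited nodes to cope with cycles.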

21.4.5 Comparison of NoSQL Databases

1. Column-based databases allow for rapid location and return of data
from one particular attribute. They are potentially very slow at
writing, however, since data may need to be shuffled around to
allow a new data item to be inserted. As a rough guide, then,
traditional transactionally oriented applications will probably fare
better in an RDBMS. Column-based databases will probably thrive in
areas where speed of access to nonvolatile data is important, for
example, in some decision support applications. You only need to
review marketing material from commercial contenders, like Ingres
Vectorwise, to see that business analytics is seen as the key market
and speed of data access the main product differentiator.

2. If you do not need large and complex data structures and can always
access your data using a known key, then key-value stores have a
performance advantage over most RDBMSs. Oracle has a feature
within its RDBMS that allows you to define a table as an
index-organized table (IOT), and this works in a similar way. However,
you still have the overhead of consistency checking, and these
IOTs are often just a small part of a larger schema. RDBMSs have a
reputation for poor scaling in distributed systems, and this is where
key-value stores can have a distinct advantage.
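
The access patterns compared in the two items above can be sketched with plain in-memory structures. This is a toy model under stated assumptions, not a benchmark and not the storage format of any real product. The sample records and the function names (`total_sales_row_store`, `total_sales_column_store`, `insert_column_store`, `get_by_key`) are invented for illustration.

```python
# Row-oriented layout: each record is stored together (made-up sample data).
rows = [
    {"id": 1, "region": "north", "sales": 120},
    {"id": 2, "region": "south", "sales": 90},
    {"id": 3, "region": "north", "sales": 75},
]

# Column-oriented layout: one separate list per attribute.
columns = {name: [r[name] for r in rows] for name in rows[0]}

# Key-value layout: each record is reachable only through its known key.
kv = {r["id"]: r for r in rows}


def total_sales_row_store(rows):
    """Iterate over whole records to read a single attribute."""
    return sum(r["sales"] for r in rows)


def total_sales_column_store(columns):
    """Read one attribute's list and nothing else."""
    return sum(columns["sales"])


def insert_column_store(columns, record):
    """One logical insert becomes one write per column."""
    for name, values in columns.items():
        values.append(record[name])


def get_by_key(kv, key):
    """Single lookup by key; returns None when the key is absent."""
    return kv.get(key)
```

Reading one attribute touches a single list in the column layout, while an insert has to update every column, which mirrors the read/write trade-off in item 1. The key-value lookup in item 2 is one step, but only works when the key is known in advance.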
