[Figure 6.18 Interacting with the Urika graph analytics appliance. Users load RDF data into the system and then send graph queries using SPARQL. The results of these queries are then sent to tools that allow an analyst to view graphs or generate reports. The diagram shows RDF inputs flowing into the appliance's service nodes and accelerator nodes, with outputs going to visualization tools and a dashboard (alerts, reports).]
… graph accelerator CPUs. It's worth noting that these graph accelerator CPUs were purpose-built for the challenges of graph analytics, and they are instrumental in enabling Urika to deliver two to four orders of magnitude better performance than conventional clusters.
The impact of this performance is impressive. Interactive queries become the norm, with responses arriving in seconds instead of days. That's important because when queries reveal unexpected relationships, analysts can, within minutes, modify their searches to leverage the findings and uncover additional evidence. Discovery is about finding unknown relationships, and this requires the ability to quickly test new hypotheses.
Now let's see how users can interact with a typical graph appliance. Figure 6.18
shows how data is moved into a graph appliance like Urika and how outputs can be
visualized by a user.
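To make the first step in figure 6.18 concrete, here is a minimal sketch of loading a few claim triples into the appliance's RDF store using a standard SPARQL 1.1 update. Every identifier and value (the ex: namespace, ex:claim100, ex:provider7, and so on) is invented for illustration and is not part of Urika itself:

   # Load a handful of hypothetical Medicare-claim triples into the store.
   PREFIX ex: <http://example.org/claims#>

   INSERT DATA {
     ex:claim100  ex:billedBy  ex:provider7 ;
                  ex:patient   ex:patient42 ;
                  ex:procedure ex:proc9921 ;
                  ex:amount    1450.00 .
     ex:provider7 ex:address   "100 Main St, Springfield" .
   }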
The software stack of the appliance leverages the RDF and SPARQL W3C standards
for graphs, which facilitates the import and integration of data from multiple sources.
The visualization and dashboard tools used in fraud analysis have their own unique requirements, so the appliance's ability to quickly and easily integrate custom visualizations and dashboards is key to rapid deployment.
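As a sketch of the query step, an analyst might send a SPARQL query like the following, whose tabular results are exactly the kind of output a reporting dashboard can consume. It assumes the same invented ex: vocabulary used above:

   # Total amount billed per provider, highest first -- the kind of
   # tabular result a report or dashboard might display.
   PREFIX ex: <http://example.org/claims#>

   SELECT ?provider (SUM(?amount) AS ?totalBilled)
   WHERE {
     ?claim ex:billedBy ?provider ;
            ex:amount   ?amount .
   }
   GROUP BY ?provider
   ORDER BY DESC(?totalBilled)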
Medicare fraud analytics is similar to financial fraud analysis, or to the search for persons of interest by counterterrorism or law enforcement agencies, where the discovery of unknown or hidden relationships in the data can lead to substantial financial or safety benefits.
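To illustrate what a hidden-relationship query can look like, here is one hypothetical discovery pattern in the same invented vocabulary: it finds pairs of distinct providers registered at the same address, a link that is easy to miss in row-by-row claim reviews but trivial to express as a graph pattern:

   # Find pairs of distinct providers that share a mailing address.
   PREFIX ex: <http://example.org/claims#>

   SELECT ?providerA ?providerB ?address
   WHERE {
     ?providerA ex:address ?address .
     ?providerB ex:address ?address .
     # Compare as strings to keep each pair once and skip self-matches.
     FILTER (STR(?providerA) < STR(?providerB))
   }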
6.11 Summary
In this chapter, we reviewed the ability of NoSQL systems to handle big data problems using many processors. It's clear that moving from a single CPU to distributed
database systems adds new management challenges that must be considered. Luckily,
most NoSQL systems are designed with distributed processing in mind. They use
techniques to spread the computing load evenly among hundreds or even thousands
of nodes.
The problems of large datasets that need rapid analysis won't go away. Barring an event like the zombie apocalypse, big data problems will continue to grow at exponential rates. As long as people continue to create and share data, the need to quickly analyze it will only grow.