[Figure 6.18 Interacting with the Urika graph analytics appliance. Users load RDF data into the system and then send graph queries using SPARQL. The results of these queries are then sent to tools that allow an analyst to view graphs or generate reports. The diagram shows RDF inputs flowing into the appliance's service nodes and accelerator nodes, with outputs going to visualization tools and a dashboard (alerts, reports).]
… graph accelerator CPUs. It's worth noting that these graph accelerator CPUs were purpose-built for the challenges of graph analytics, and they are instrumental in enabling Urika to deliver two to four orders of magnitude better performance than conventional clusters.
The impact of this performance is impressive. Interactive queries become the norm, with responses arriving in seconds instead of days. That's important because when queries reveal unexpected relationships, analysts can, within minutes, modify their searches to leverage the findings and uncover additional evidence. Discovery is about finding unknown relationships, and this requires the ability to quickly test new hypotheses.
Now let's see how users can interact with a typical graph appliance. Figure 6.18
shows how data is moved into a graph appliance like Urika and how outputs can be
visualized by a user.
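To make the first step in figure 6.18 concrete, here is a minimal sketch of loading a few claim triples into the appliance's RDF store using a standard SPARQL 1.1 update. Every identifier and value (the ex: namespace, ex:claim100, ex:provider7, and so on) is invented for illustration and is not part of Urika itself:

   # Load a handful of hypothetical Medicare-claim triples into the store.
   PREFIX ex: <http://example.org/claims#>

   INSERT DATA {
     ex:claim100  ex:billedBy  ex:provider7 ;
                  ex:patient   ex:patient42 ;
                  ex:procedure ex:proc9921 ;
                  ex:amount    1450.00 .
     ex:provider7 ex:address   "100 Main St, Springfield" .
   }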
The software stack of the appliance leverages the RDF and SPARQL W3C standards
for graphs, which facilitates the import and integration of data from multiple sources.
The visualization and dashboard tools used in fraud analysis have their own unique requirements, so the appliance's ability to quickly and easily integrate custom visualizations and dashboards is key to rapid deployment.
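As a sketch of the query step, an analyst might send a SPARQL query like the following, whose tabular results are exactly the kind of output a reporting dashboard can consume. It assumes the same invented ex: vocabulary used above:

   # Total amount billed per provider, highest first -- the kind of
   # tabular result a report or dashboard might display.
   PREFIX ex: <http://example.org/claims#>

   SELECT ?provider (SUM(?amount) AS ?totalBilled)
   WHERE {
     ?claim ex:billedBy ?provider ;
            ex:amount   ?amount .
   }
   GROUP BY ?provider
   ORDER BY DESC(?totalBilled)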
Medicare fraud analytics is similar to financial fraud analysis, or to the search for persons of interest by counterterrorism or law enforcement agencies, where the discovery of unknown or hidden relationships in the data can lead to substantial financial or safety benefits.
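To illustrate what a hidden-relationship query can look like, here is one hypothetical discovery pattern in the same invented vocabulary: it finds pairs of distinct providers registered at the same address, a link that is easy to miss in row-by-row claim reviews but trivial to express as a graph pattern:

   # Find pairs of distinct providers that share a mailing address.
   PREFIX ex: <http://example.org/claims#>

   SELECT ?providerA ?providerB ?address
   WHERE {
     ?providerA ex:address ?address .
     ?providerB ex:address ?address .
     # Compare as strings to keep each pair once and skip self-matches.
     FILTER (STR(?providerA) < STR(?providerB))
   }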
6.11 Summary
In this chapter, we reviewed the ability of NoSQL systems to handle big data problems using many processors. It's clear that moving from a single CPU to distributed
database systems adds new management challenges that must be considered. Luckily,
most NoSQL systems are designed with distributed processing in mind. They use
techniques to spread the computing load evenly among hundreds or even thousands
of nodes.
The problems of large datasets that need rapid analysis won't go away. Barring an event like the zombie apocalypse, big data problems will continue to grow at exponential rates. As long as people continue to create and share data, the need to quickly analyze it will only grow.