6.10.2 Using graphs and custom shared-memory hardware to detect health care fraud
Graphs are valuable in situations where data discovery is required. Graphs can show
relationships between health care beneficiaries, their claims, associated care
providers, tests performed, and other relevant data. Graph analytics search through
the data to find patterns of relationships between all of these entities that might
indicate collusion to commit fraud.
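As a minimal sketch of this kind of pattern search, the Python below links claims to beneficiaries and providers and flags provider pairs that bill for many of the same beneficiaries. The record layout, the identifiers, the `min_shared` threshold, and the function names are all hypothetical and chosen only for illustration; they aren't taken from any Medicare system.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical claim records: (claim_id, beneficiary_id, provider_id).
claims = [
    ("c1", "b1", "p1"),
    ("c2", "b1", "p2"),
    ("c3", "b2", "p1"),
    ("c4", "b2", "p2"),
    ("c5", "b3", "p1"),
    ("c6", "b3", "p2"),
    ("c7", "b4", "p3"),
]

def build_graph(claims):
    """Return an undirected adjacency map linking each claim to its
    beneficiary and its provider."""
    graph = defaultdict(set)
    for claim_id, beneficiary_id, provider_id in claims:
        for node in (beneficiary_id, provider_id):
            graph[claim_id].add(node)
            graph[node].add(claim_id)
    return graph

def shared_beneficiaries(claims, min_shared=3):
    """Find provider pairs that bill for at least min_shared of the
    same beneficiaries (an assumed, simplistic collusion signal)."""
    by_provider = defaultdict(set)
    for _, beneficiary_id, provider_id in claims:
        by_provider[provider_id].add(beneficiary_id)
    flagged = {}
    for a, b in combinations(sorted(by_provider), 2):
        common = by_provider[a] & by_provider[b]
        if len(common) >= min_shared:
            flagged[(a, b)] = common
    return flagged

graph = build_graph(claims)
suspicious = shared_beneficiaries(claims)
```

Here `suspicious` holds the single pair `("p1", "p2")`, which shares beneficiaries `b1`, `b2`, and `b3`. A real analysis would weigh many more relationship types than this one overlap count.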
The graph representing Medicare data is large: it represents six million providers,
a hundred million patients, and billions of claim records. The graph links health care
providers, diagnostic tests, and common treatments to each patient and their claim
records. This amount of data can't be held in the memory of a single server, and
partitioning the data across multiple nodes in a computing cluster isn't feasible.
Attempts to do so may result in incomplete queries because so many links cross
partition boundaries, data must be paged in and out of memory, and slower network and
storage speeds add delays. Meanwhile, fraud continues to occur at an alarming rate.
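To illustrate why partitioning hurts, the sketch below assigns each node to a partition by hashing its identifier and measures the fraction of edges whose endpoints land on different partitions. The hash-based placement, the edge list, and the function names are assumptions made for this example; real clusters use a variety of placement schemes.

```python
import zlib

def partition_of(node, num_partitions):
    """Assign a node to a partition with a stable hash of its ID.
    (crc32 is used because Python's built-in hash() of strings
    varies between processes.)"""
    return zlib.crc32(node.encode("utf-8")) % num_partitions

def cut_fraction(edges, num_partitions):
    """Fraction of edges whose endpoints fall in different partitions.
    Every such edge forces a network hop during traversal."""
    if not edges:
        return 0.0
    crossing = sum(
        1
        for u, v in edges
        if partition_of(u, num_partitions) != partition_of(v, num_partitions)
    )
    return crossing / len(edges)

# Hypothetical edges: each claim links to one beneficiary and one provider.
edges = [
    ("c1", "b1"), ("c1", "p1"),
    ("c2", "b1"), ("c2", "p2"),
    ("c3", "b2"), ("c3", "p1"),
    ("c4", "b2"), ("c4", "p2"),
]

single_server = cut_fraction(edges, 1)   # everything in one memory space
four_nodes = cut_fraction(edges, 4)      # same graph spread over 4 nodes
```

With one partition no edge crosses a boundary, so `single_server` is `0.0`. Under random placement across k partitions, roughly (k - 1) / k of the edges would be expected to cross, which is why densely linked graphs partition poorly.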
Medicare fraud analytics requires an in-memory graph solution that can merge
heterogeneous data from a variety of sources, use queries to find patterns, and
discover similarities as well as exact matches. With every item of data loaded into
memory, there's no need to contend with the issue of graph partitioning. The graph can
be dynamically updated with new data easily, and existing queries can integrate the new
data into the analytics being performed, making the discovery of hidden relationships
in the data feasible.
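The difference between exact matches and similarities, and the effect of dynamic updates, can be sketched with a small in-memory adjacency structure. The `InMemoryGraph` class, the Jaccard similarity measure, and the provider and test identifiers below are illustrative assumptions; the source doesn't say which similarity measure the lab used.

```python
from collections import defaultdict

class InMemoryGraph:
    """Toy undirected graph held entirely in process memory."""

    def __init__(self):
        self._adj = defaultdict(set)

    def add_edge(self, u, v):
        # New data becomes visible to every later query immediately.
        self._adj[u].add(v)
        self._adj[v].add(u)

    def neighbors(self, node):
        return self._adj.get(node, set())

def jaccard(graph, a, b):
    """Similarity of two nodes: shared neighbors / all neighbors.
    1.0 means an exact match of neighbor sets; lower values mean
    partial overlap."""
    na, nb = graph.neighbors(a), graph.neighbors(b)
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0

g = InMemoryGraph()
# Hypothetical providers p1, p2 and the tests (t1..t3) they billed.
g.add_edge("p1", "t1")
g.add_edge("p1", "t2")
g.add_edge("p2", "t1")
g.add_edge("p2", "t3")
before = jaccard(g, "p1", "p2")   # 1 shared test out of 3

g.add_edge("p2", "t2")            # a new claim arrives
after = jaccard(g, "p1", "p2")    # 2 shared tests out of 3

g.add_edge("p1", "t3")            # now both billed t1, t2, t3
exact = jaccard(g, "p1", "p2")    # identical neighbor sets
```

The same `jaccard` query is rerun unchanged after each update, so the score moves from 1/3 to 2/3 to 1.0 as new edges arrive, without reloading or repartitioning the graph.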
Figure 6.17 shows the high-level architecture of how shared-memory systems are
used to look for patterns in large graphs.
With these requirements in mind, a US federally funded lab with a mandate to
identify Medicare and Medicaid fraud deployed YarcData's Urika appliance. The
appliance is capable of scaling from 1 to 512 terabytes of memory, shared by up to 8,192
Figure 6.17 How large graphs are loaded into a central shared-memory structure. This example shows a graph in a central multi-terabyte RAM store with potentially hundreds or thousands of simultaneous threads in CPUs performing queries on the graph. Note that, like other NoSQL systems, the data stays in RAM while the analysis runs. Each CPU can perform an independent query on the graph without interfering with the others.
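The arrangement in the figure, many independent queries reading one shared in-memory graph, can be sketched with a thread pool. This is only an analogy under stated assumptions: the graph, the `two_hop` query, and the worker count are hypothetical, and CPython threads don't run CPU-bound code in parallel the way the appliance's hardware threads would.

```python
from concurrent.futures import ThreadPoolExecutor

# One shared, read-only graph held in memory (hypothetical data):
# providers p1, p2 and beneficiaries b1..b3.
GRAPH = {
    "p1": {"b1", "b2"},
    "p2": {"b2", "b3"},
    "b1": {"p1"},
    "b2": {"p1", "p2"},
    "b3": {"p2"},
}

def two_hop(start):
    """Return (start, nodes reachable by a two-step walk, minus start).
    The function only reads GRAPH, so concurrent calls don't interfere."""
    reached = set()
    for mid in GRAPH.get(start, ()):
        reached |= GRAPH.get(mid, set())
    reached.discard(start)
    return start, reached

# Each worker thread runs its own query against the same structure.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(two_hop, ["p1", "p2", "b2"]))
```

Because every query is read-only, no locking is needed. Here `results["p1"]` is `{"p2"}`, meaning that `p1` and `p2` are connected through a shared beneficiary.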
 