Note
Enable the high-performance cache by setting cache_type=hpc in
<$NEO4J_HOME>/conf/neo4j.properties.
In the Neo4j world, we consider data large when it cannot fit into the available memory
(RAM). The next option would then be to introduce shards and distribute them across
individual nodes.
Sharding the data and then caching it on individual nodes is a reasonable and scalable
solution, but graphs are difficult to shard with a traditional sharding approach, and the
result may not scale for real-time transactions either. That is why Neo4j provides no
utility or API for sharding data.
So what's next?
The answer is cache-based sharding.
In cache-based sharding, every node in the cluster contains the full data set, but we
partition the types of requests served by each database instance to increase the
likelihood of hitting a warm cache for a given request. Warm caches in Neo4j perform
extremely well, especially the high-performance cache (HPC).
In short, we recommend a routing strategy that directs user read requests so that
requests of a given kind are always served by the same set of nodes in the cluster.
The strategy could be based on the domain, on the specific type of query, or on
characteristics of the data. We could also use sticky sessions, where the first request
and all subsequent requests of a session are served by the same node.
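
The routing idea can be sketched in a few lines of Python. This is only an illustration
under stated assumptions, not a Neo4j API: the class CacheShardRouter, its methods, and
the host names are hypothetical, and a real deployment would normally put this logic in a
load balancer or in the application's data access layer. The sketch hashes a routing key
(for example a domain name or a user ID) to pick an instance, and optionally pins a
session to the instance chosen for its first request.

```python
import hashlib


class CacheShardRouter:
    """Hypothetical router: sends reads for the same key to the same instance."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self.instances = list(instances)
        self.sessions = {}  # session id -> instance, used for sticky sessions

    def route_by_key(self, key):
        # Use a stable hash (not Python's per-process randomized hash()) so the
        # same key maps to the same instance across restarts of the router.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.instances)
        return self.instances[index]

    def route_sticky(self, session_id, key):
        # The first request of a session picks an instance by key; every later
        # request in that session reuses the same instance.
        if session_id not in self.sessions:
            self.sessions[session_id] = self.route_by_key(key)
        return self.sessions[session_id]


# Hypothetical host names for a three-instance cluster.
router = CacheShardRouter(["neo4j-1:7474", "neo4j-2:7474", "neo4j-3:7474"])
target = router.route_by_key("domain:billing")
```

Because every instance holds the full data, a request that lands on the "wrong" instance
is still answered correctly; it is only more likely to hit a cold cache. Note that simple
modulo hashing remaps many keys when the number of instances changes, so the caches need
time to warm up again after the cluster is resized.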
In any case, we need to ensure that the majority of read requests are served by a warm
cache rather than from disk.