Data replication and data locality
The Neo4j HA architecture replicates data asynchronously to the other nodes in a cluster. All
write operations are performed first on the master node; the slave nodes are then either
synchronized by the master or poll the master for new data since their last checkpoint. The
behavior of data replication is driven by the following properties, defined in
<$NEO4J_HOME>/conf/neo4j.properties:
ha.pull_interval: The interval at which slaves pull updates from
the master. The unit is seconds.
ha.tx_push_factor: The number of slave nodes to which the master will try to
push a transaction before returning success to the client. Setting this to 0
switches off synchronous writes to slave nodes, which improves write performance
but also increases the risk of data loss, because the master would be the only
node containing the transaction.
ha.tx_push_strategy: Either fixed or round_robin. This determines
the order in which slave nodes are selected to receive pushed transactions. With fixed, the
priority is decided by the value of ha.server_id, highest first.
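To make the interaction of these three properties concrete, here is a minimal Python sketch. It is a simplified model, not Neo4j's implementation: the sample property values, the helper names parse_properties and pick_push_targets, and the way round_robin rotates the starting slave per commit are all assumptions made for illustration.

```python
# Hypothetical sample of the three HA settings (values are illustrative only).
SAMPLE_PROPERTIES = """
# replication settings
ha.pull_interval=10
ha.tx_push_factor=1
ha.tx_push_strategy=fixed
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def pick_push_targets(slave_ids, push_factor, strategy, commit_number=0):
    """Return the server ids a master would try to push to (illustrative model)."""
    if push_factor <= 0 or not slave_ids:
        return []  # push factor 0: no synchronous push to any slave
    ordered = sorted(slave_ids, reverse=True)  # highest ha.server_id first
    if strategy == "round_robin":
        # Assumed rotation: start one position further along on each commit.
        start = commit_number % len(ordered)
        ordered = ordered[start:] + ordered[:start]
    elif strategy != "fixed":
        raise ValueError(f"unknown strategy: {strategy}")
    return ordered[:push_factor]


settings = parse_properties(SAMPLE_PROPERTIES)
targets = pick_push_targets(
    [2, 3, 4],
    int(settings["ha.tx_push_factor"]),
    settings["ha.tx_push_strategy"],
)
print(targets)
```

With the sample values above, the single push target is the slave with the highest server id. With a push factor of 0 the function returns an empty list, which corresponds to the master being the only node that holds the transaction until the slaves pull it.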
A write transaction on a slave is always synchronized with the master. When the transaction
commits, it is committed first on the master and, if that succeeds, then on the slave. To
ensure consistency, the slave must be up to date with the master before it performs a write
operation. This is built into the communication protocol between the slave and the master,
so updates are applied automatically to a slave whenever it communicates with its master.
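The ordering described above (catch up, commit on the master, then commit locally) can be sketched in Python as follows. The Master and Slave classes are hypothetical stand-ins that model transactions as list entries; they illustrate the ordering only and say nothing about the actual wire protocol.

```python
class Master:
    """Toy master: holds the ordered list of committed transactions."""

    def __init__(self):
        self.log = []

    def commit(self, tx):
        self.log.append(tx)
        return len(self.log)  # id of the transaction just committed

    def updates_since(self, last_tx_id):
        return self.log[last_tx_id:]


class Slave:
    """Toy slave: forwards writes to the master before applying them locally."""

    def __init__(self, master):
        self.master = master
        self.log = []

    def catch_up(self):
        """Apply every master transaction this slave has not seen yet."""
        self.log.extend(self.master.updates_since(len(self.log)))

    def write(self, tx):
        self.catch_up()                 # 1. be up to date before writing
        tx_id = self.master.commit(tx)  # 2. commit on the master first
        self.log.append(tx)             # 3. only then commit locally
        return tx_id


master = Master()
slave_a, slave_b = Slave(master), Slave(master)
slave_a.write("tx1")
slave_b.write("tx2")  # slave_b first receives tx1 as part of its own write
print(slave_b.log)
```

In this model, slave_a does not see "tx2" until it next communicates with the master, either through another write or through an explicit catch_up() call, which plays the role of the periodic pull.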
Neo4j replicates the full data set to every node, so each node is self-sufficient to
serve read/write requests. This also helps in achieving low latency. To serve a global
audience, additional Neo4j servers can be configured as read-only slave servers and
placed geographically close to the customers. These read-only slaves are kept in sync
with the master, and all local read requests are directed to and served by them, which
provides data locality for client applications.
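One way a client application might take advantage of this layout is to route reads to the nearest read-only slave and send writes to the master. The Python sketch below assumes a hypothetical region-to-host map; the host names, region keys, and the route function are invented for illustration and are not part of Neo4j.

```python
# Hypothetical deployment: one master plus read-only slaves per region.
MASTER = "neo4j-master.example.internal"
READ_SLAVES = {
    "eu": "neo4j-read-eu.example.internal",
    "ap": "neo4j-read-ap.example.internal",
}


def route(request_kind, client_region):
    """Pick the server for a request: local read-only slave for reads, master otherwise."""
    if request_kind == "read":
        # Fall back to the master when no slave is deployed in the client's region.
        return READ_SLAVES.get(client_region, MASTER)
    if request_kind == "write":
        return MASTER  # read-only slaves cannot accept writes
    raise ValueError(f"unknown request kind: {request_kind}")


print(route("read", "eu"))
print(route("write", "eu"))
```

Reads from a region without a local slave fall back to the master in this sketch; a real deployment might instead choose the next-closest slave.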