Databases Reference
In-Depth Information
Recoverability deals with the capability of the database to set things right after a fault
has arisen. In addition to recoverability, a good database needs to be highly available to
meet the increasingly sophisticated needs of data-heavy applications.
Availability
In addition to being valuable in and of themselves, Neo4j's transaction and recovery
capabilities also benefit its high-availability characteristics; its ability to recognize and,
if necessary, repair an instance after crashing means that data quickly becomes available
again without human intervention. And of course, more live instances increase the
overall availability of the database to process queries.
It's uncommon to want individual disconnected database instances in a typical
production scenario. More often, we cluster database instances for high availability. Neo4j
uses a master-slave cluster arrangement to ensure a complete replica of the graph is
stored on each machine. Writes are replicated out from the master to the slaves at
frequent intervals. At any point, the master and some slaves will have a completely up-to-
date copy of the graph, while other slaves will be catching up (typically, they will be but
milliseconds behind).
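
The catching-up behavior described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the Master and Slave classes, the catch_up method, and the list-based transaction log are hypothetical names invented for illustration and are not Neo4j APIs. The model also ignores timing and the direction in which updates travel; it only shows that each slave holds a prefix of the master's log and is either current or a few transactions behind.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Hypothetical master: holds the authoritative, ordered transaction log."""
    log: list = field(default_factory=list)

    def commit(self, tx):
        self.log.append(tx)


@dataclass
class Slave:
    """Hypothetical slave: keeps a full replica by applying the log in order."""
    name: str
    log: list = field(default_factory=list)

    def catch_up(self, master):
        # Apply only the transactions this slave has not yet seen.
        self.log.extend(master.log[len(self.log):])

    def lag(self, master):
        # Number of committed transactions this slave is still missing.
        return len(master.log) - len(self.log)


master = Master()
slaves = [Slave("s1"), Slave("s2")]

master.commit("create node A")
for s in slaves:
    s.catch_up(master)

master.commit("create node B")
slaves[0].catch_up(master)   # s1 is current again; s2 has not caught up yet

lags = {s.name: s.lag(master) for s in slaves}
print(lags)
```

In this run s1 reports a lag of zero and s2 a lag of one transaction, which mirrors the situation in the text where some slaves are fully up to date and others are slightly behind.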
The classic write-master with read-slaves arrangement is a popular topology. With this
setup, all database writes are directed at the master, and read operations are directed at
slaves. Write throughput therefore levels off at the capacity of the master (ultimately,
a single spindle), while read throughput scales nearly linearly with the number of
slaves (accounting for the modest overhead in managing the cluster).
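
A client-side view of this topology might look like the following sketch. The ClusterRouter class and the instance names are hypothetical, and round-robin is just one plausible way to spread reads; the text does not say how requests are balanced in practice.

```python
from itertools import cycle


class ClusterRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    def __init__(self, master, slaves):
        if not slaves:
            raise ValueError("at least one slave is needed to serve reads")
        self.master = master
        self._read_targets = cycle(slaves)  # simple round-robin (an assumption)

    def route(self, is_write):
        # Every write lands on the single master; reads take the next slave.
        return self.master if is_write else next(self._read_targets)


router = ClusterRouter("master", ["slave-1", "slave-2"])
targets = [router.route(is_write=w) for w in (True, False, False, False, True)]
print(targets)
```

Adding a slave to the list gives reads one more target, whereas writes always funnel into the same instance. That is why reads scale with cluster size and writes do not.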
Although write-master with read-slaves is a classic deployment topology, Neo4j also
supports writing through slaves. In this scenario, the slave to which a write has been
directed by the client first ensures it is consistent with the master (it “catches up”);
thereafter, the write is synchronously transacted across both instances. This is useful
when we want immediate durability in two database instances; furthermore, because it
allows writes to be directed to any instance, it offers additional deployment flexibility.
It comes at the cost of higher write latency, however, owing to the forced catch-up phase.
Writing through slaves does not imply that writes are distributed around the system: all
writes must still pass through the master at some point.
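
The sequence of a write through a slave can be sketched as follows. This is a toy model, not Neo4j's implementation: the Instance class and write_through_slave function are hypothetical, and the order in which the master and the slave apply the new transaction is an assumption, since the text says only that the write is transacted synchronously across both.

```python
class Instance:
    """Hypothetical database instance with an ordered transaction log."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def apply(self, tx):
        self.log.append(tx)


def write_through_slave(master, slave, tx):
    """Commit tx via a slave and return the steps taken, in order."""
    steps = []
    # 1. Forced catch-up: the slave first applies whatever it is missing.
    missing = master.log[len(slave.log):]
    for old_tx in missing:
        slave.apply(old_tx)
    steps.append(f"catch-up:{len(missing)}")
    # 2. The write still passes through the master (order is an assumption).
    master.apply(tx)
    steps.append("master")
    # 3. The slave applies it too before the call returns.
    slave.apply(tx)
    steps.append("slave")
    return steps


master, slave = Instance("master"), Instance("slave")
master.apply("tx1")
master.apply("tx2")          # the slave has seen neither of these yet

steps = write_through_slave(master, slave, "tx3")
print(steps)
print(slave.log == master.log)
```

The catch-up step is where the extra write latency comes from: the further behind the slave is, the more work it must do before the new transaction can be applied. When the call returns, the transaction is present on both instances.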
Other Replication Options in Neo4j
From Neo4j version 1.8 onward, it's possible to specify that writes to the master are replicated
in a best-effort manner to an arbitrary number of replicas before a transaction is
considered complete. This provides an alternative to the “at least two” level of durability
achieved by writing through slaves. See “Replication” on page 79 for more details.
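
One way to read "best-effort" replication is sketched below. The commit_with_push function, its push_factor parameter, and the Replica class are hypothetical names for illustration only. The failure handling shown here (skip an unreachable replica and still complete the commit) is an assumption about what best-effort means, not a description of Neo4j's actual behavior.

```python
class Replica:
    """Hypothetical replica that may or may not be reachable."""

    def __init__(self, name, reachable=True):
        self.name, self.reachable, self.log = name, reachable, []

    def receive(self, tx):
        if not self.reachable:
            raise ConnectionError(self.name)
        self.log.append(tx)


def commit_with_push(master_log, replicas, tx, push_factor):
    """Append tx on the master, then try to push it to push_factor replicas.

    Best effort (assumed semantics): a failing replica is skipped, and the
    commit completes even if fewer than push_factor replicas acknowledged.
    """
    master_log.append(tx)
    acked = []
    for replica in replicas:
        if len(acked) == push_factor:
            break
        try:
            replica.receive(tx)
            acked.append(replica.name)
        except ConnectionError:
            continue         # best effort: move on to the next replica
    return acked


replicas = [Replica("r1"), Replica("r2", reachable=False), Replica("r3")]
master_log = []
acked = commit_with_push(master_log, replicas, "tx1", push_factor=2)
print(acked)
```

Here the unreachable replica is passed over and the transaction still reaches two others. A higher push factor buys more copies at commit time, at the price of more work per write.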
 