(assuming a positive response to the prepare phase) a commit entry will be written to
the log. This causes the log to be flushed to disk, thereby making the changes durable.
Once the disk flush has occurred, the changes are applied to the graph itself. After all
the changes have been applied to the graph, any write locks associated with the
transaction are released.
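
The ordering described above (log the commit entry, flush to disk, apply the changes, release locks) can be illustrated with a minimal sketch. This is not Neo4j's implementation: `ToyStore`, its `commit` method, the JSON log format, and the single lock taken at commit time are all hypothetical simplifications chosen for illustration.

```python
import json
import os
import threading


class ToyStore:
    """Hypothetical key-value 'graph' protected by a write-ahead log file."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}                      # stands in for the graph store
        self.write_lock = threading.Lock()  # stands in for a transaction's write locks

    def commit(self, tx_id, changes):
        # Simplification: a real transaction acquires its write locks while it
        # runs, not at commit time.
        self.write_lock.acquire()
        try:
            # 1. Append the commit entry to the log and force it to disk.
            with open(self.log_path, "a", encoding="utf-8") as log:
                log.write(json.dumps({"tx": tx_id, "changes": changes}) + "\n")
                log.flush()
                os.fsync(log.fileno())
            # 2. Only after the flush are the changes applied to the store.
            self.data.update(changes)
        finally:
            # 3. Release the write lock once the changes have been applied.
            self.write_lock.release()
```

Because the log entry reaches disk before the store is touched, a crash between steps 1 and 2 loses nothing: the entry is still available in the log after a restart.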
Once a transaction has committed, the system is in a state where changes are guaranteed
to be in the database even if a fault then causes a non-pathological failure. This, as we
shall now see, confers substantial advantages for recoverability, and hence for ongoing
provision of service.
Recoverability
Databases are no different from any other pieces of software in that they are susceptible
to bugs in their implementation, and to faults in the hardware they run on and in that
hardware's power and cooling infrastructures. Though diligent engineers try to minimize
the possibility of failure in all of these, at some point it's inevitable that a database
will crash, though the mean time between failures should be very long indeed.
In a well-designed system, a database server crash, though annoying, ought not affect
availability, though it may affect throughput. And when a failed server resumes operation,
it must not serve corrupt data to its users, irrespective of the nature or timing of
the crash.
When recovering from an unclean shutdown, perhaps caused by a fault or even an
overzealous operator, Neo4j checks the most recently active transaction log and
replays any transactions it finds against the store. It's possible that some of those
transactions may have already been applied to the store, but because replaying is an
idempotent action, the net result is the same: after recovery, the store will be
consistent with all transactions successfully committed prior to the failure.
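
To see why replay can safely be repeated, consider a minimal sketch in which each log entry records absolute values rather than deltas. The entry format and the `apply_entry` and `recover` helpers are assumptions made for illustration, not Neo4j's actual log format or recovery code.

```python
import json


def apply_entry(store, entry):
    """Apply one log entry. Writing absolute values makes this idempotent."""
    for key, value in entry["changes"].items():
        if value is None:
            store.pop(key, None)  # a deletion; harmless if the key is already gone
        else:
            store[key] = value    # overwrite; same result however often it runs


def recover(store, log_lines):
    """Replay every logged transaction, in commit order, against the store."""
    for line in log_lines:
        apply_entry(store, json.loads(line))
    return store


log = [
    json.dumps({"tx": 1, "changes": {"a": 1, "b": 5}}),
    json.dumps({"tx": 2, "changes": {"a": 2, "b": None}}),
]

from_empty = recover({}, log)
# Transaction 1 had already reached this store before the crash.
from_partial = recover({"a": 1, "b": 5}, log)
```

Both `from_empty` and `from_partial` end up as `{"a": 2}`. A log that recorded deltas instead (for example, "increment a by 1") would not have this property, because replaying an already-applied entry would apply it twice.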
Local recovery is all that is necessary in the case of a single database instance. Generally,
however, we run databases in clusters (which we'll discuss shortly) to assure high
availability on behalf of client applications. Fortunately, clustering confers additional benefits
to recovering instances: not only will an instance become consistent with all transactions
successfully committed prior to its failure, as discussed earlier, it can also quickly catch
up with other instances in the cluster, and thereby be consistent with all transactions
successfully committed subsequent to its failure. That is, once local recovery has
completed, a replica can ask other members of the cluster (typically the master) for any
newer transactions; it can then apply these newer transactions to its own dataset via
transaction replay.
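
The catch-up step can be sketched as follows, assuming each transaction carries a monotonically increasing identifier so that a replica can ask for everything newer than the last one it holds. The `Instance` class and its `transactions_since` and `catch_up` methods are hypothetical names, not part of Neo4j's clustering API.

```python
class Instance:
    """Hypothetical cluster member holding a transaction log and a dataset."""

    def __init__(self):
        self.log = []   # (tx_id, changes) pairs in commit order
        self.data = {}  # stands in for the local store

    @property
    def last_tx_id(self):
        # 0 means "no transactions committed yet".
        return self.log[-1][0] if self.log else 0

    def commit(self, tx_id, changes):
        self.log.append((tx_id, changes))
        self.data.update(changes)

    def transactions_since(self, tx_id):
        """Return every logged transaction newer than tx_id, oldest first."""
        return [(t, c) for t, c in self.log if t > tx_id]

    def catch_up(self, master):
        """Pull newer transactions from the master and replay them locally."""
        for tx_id, changes in master.transactions_since(self.last_tx_id):
            self.commit(tx_id, changes)


master = Instance()
for tx_id, changes in enumerate([{"a": 1}, {"b": 2}, {"a": 3}], start=1):
    master.commit(tx_id, changes)

replica = Instance()
replica.commit(1, {"a": 1})  # state after local recovery: only transaction 1
replica.catch_up(master)     # fetches and replays transactions 2 and 3
```

After `catch_up`, the replica's data matches the master's. Calling `catch_up` again is a no-op, because the replica's last transaction identifier now equals the master's.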
 