Graph Database Internals - Graph Databases


(assuming a positive response to the prepare phase) a commit entry will be written to the log. This causes the log to be flushed to disk, thereby making the changes durable. Once the disk flush has occurred, the changes are applied to the graph itself. After all the changes have been applied to the graph, any write locks associated with the transaction are released.
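The ordering described above (log the commit entry, flush to disk, apply to the graph, release locks) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Neo4j's implementation: `TinyStore`, its JSON-lines log format, and the single `write_lock` are all hypothetical names invented for the example, and the "graph" is just a dictionary.

```python
import json
import os
import threading


class TinyStore:
    """Hypothetical store: a dict stands in for the graph (an assumption)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        # One coarse lock stands in for the transaction's write locks.
        self.write_lock = threading.Lock()

    def commit(self, tx_id, changes):
        self.write_lock.acquire()
        try:
            # 1. Write the commit entry to the log.
            with open(self.log_path, "a", encoding="utf-8") as log:
                log.write(json.dumps({"tx": tx_id, "changes": changes}) + "\n")
                # 2. Force the log to disk; after this the changes are durable.
                log.flush()
                os.fsync(log.fileno())
            # 3. Only after the flush, apply the changes to the store itself.
            self.data.update(changes)
        finally:
            # 4. Release the write locks once the changes have been applied.
            self.write_lock.release()
```

The design point the sketch tries to capture is that the store is touched only after the log entry is safely on disk, so a crash between steps 2 and 3 loses nothing that the log cannot reproduce.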

Once a transaction has committed, the system is in a state where changes are guaranteed to be in the database even if a fault then causes a non-pathological failure. This, as we shall now see, confers substantial advantages for recoverability, and hence for ongoing provision of service.

Recoverability

Databases are no different from any other piece of software in that they are susceptible to bugs in their implementation, to faults in the hardware they run on, and to failures in that hardware's power and cooling infrastructure. Though diligent engineers try to minimize the possibility of failure in all of these, at some point it's inevitable that a database will crash, though the mean time between failures should be very long indeed.

In a well-designed system, a database server crash, though annoying, ought not to affect availability, though it may affect throughput. And when a failed server resumes operation, it must not serve corrupt data to its users, irrespective of the nature or timing of the crash.

When recovering from an unclean shutdown, perhaps caused by a fault or even an overzealous operator, Neo4j checks the most recently active transaction log and replays any transactions it finds against the store. It's possible that some of those transactions may have already been applied to the store, but because replaying is an idempotent action, the net result is the same: after recovery, the store will be consistent with all transactions successfully committed prior to the failure.
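To see why replaying an already-applied transaction is harmless, consider the following minimal sketch. It assumes, purely for illustration, that each log entry records the absolute new value of every changed item rather than a delta; the `replay` function and the JSON entry layout are hypothetical and do not reflect Neo4j's actual log format.

```python
import json


def replay(log_lines, store):
    """Apply every logged transaction to `store`, in log order."""
    for line in log_lines:
        entry = json.loads(line)
        for key, value in entry["changes"].items():
            # Absolute assignment: doing this twice leaves the same state.
            store[key] = value
    return store


log = [
    json.dumps({"tx": 1, "changes": {"a": 1}}),
    json.dumps({"tx": 2, "changes": {"a": 2, "b": 3}}),
]

# A store that crashed after applying tx 1 but before applying tx 2.
partially_applied = {"a": 1}
recovered = replay(log, partially_applied)

# A store that had applied nothing at all.
from_scratch = replay(log, {})
```

Under these assumptions, `recovered` and `from_scratch` end up identical, and replaying the log a second time changes nothing. A log that recorded increments instead of final values would not have this property.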

Local recovery is all that is necessary in the case of a single database instance. Generally, however, we run databases in clusters (which we'll discuss shortly) to assure high availability on behalf of client applications. Fortunately, clustering confers additional benefits to recovering instances: not only will an instance become consistent with all transactions successfully committed prior to its failure, as discussed earlier, it can also quickly catch up with other instances in the cluster, and thereby be consistent with all transactions successfully committed subsequent to its failure. That is, once local recovery has completed, a replica can ask other members of the cluster, typically the master, for any newer transactions; it can then apply these newer transactions to its own dataset via transaction replay.
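The catch-up step can be sketched as follows. This is a toy model under loud assumptions: the `Instance` class, its `transactions_after` method, and the use of monotonically increasing integer transaction ids are all hypothetical, and the "network request" to the master is just a method call.

```python
class Instance:
    """Hypothetical cluster member holding a log and a dict-based dataset."""

    def __init__(self):
        self.log = []       # (tx_id, changes) pairs, tx ids ascending
        self.data = {}
        self.last_tx = 0    # id of the newest transaction applied here

    def apply(self, tx_id, changes):
        if tx_id <= self.last_tx:
            return  # already applied; skipping keeps replay safe to repeat
        self.data.update(changes)
        self.log.append((tx_id, changes))
        self.last_tx = tx_id

    def transactions_after(self, tx_id):
        # What a master might return when a replica asks for newer work.
        return [(t, c) for t, c in self.log if t > tx_id]


def catch_up(replica, master):
    """Ask the master for transactions newer than the replica's, then replay."""
    for tx_id, changes in master.transactions_after(replica.last_tx):
        replica.apply(tx_id, changes)


master = Instance()
master.apply(1, {"a": 1})
master.apply(2, {"b": 2})
master.apply(3, {"a": 3})

replica = Instance()
replica.apply(1, {"a": 1})   # state after local recovery: only tx 1 is known

catch_up(replica, master)
```

After `catch_up`, the replica in this sketch holds the same data and the same last transaction id as the master; calling `catch_up` again is a no-op because no newer transactions remain.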
