Neo4j in production - Neo4j in Action


memory, dramatically increasing performance by eliminating the need for physical disk I/O.

Changes to the data are also written to the filesystem cache, rather than immediately to

the physical disk. Accessing a file through the filesystem cache, rather than going to a

spinning disk, is around 500 times faster.

The OS is in charge of managing this memory, including making decisions about when to

flush changes written to this memory area down to the physical disk. Although the OS has

the final say as to when data is read into and written from this area, processes can request

certain files, or portions of files, to be loaded into this memory area for processing.

Neo4j takes advantage of this OS feature (through the Java NIO packages) to efficiently

load, read, and write to and from the store files. This provides the first level (sometimes

called the low-level) caching functionality within Neo4j.
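The mechanism described above can be illustrated with a minimal Java NIO sketch. This is not Neo4j's actual code; the class name MappedStoreSketch, the method writeAndRead, the temporary file, and the 4 KiB mapping size are all assumptions made for illustration. It maps a region of a file into memory, writes and reads through that mapping, and then explicitly asks the OS to flush the region.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Neo4j.
class MappedStoreSketch {

    // Writes a long at the given offset through a memory-mapped region, then reads it back.
    static long writeAndRead(Path file, int offset, long value) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map the first 4 KiB of the file; the OS backs this region with its filesystem cache.
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            region.putLong(offset, value);          // lands in memory, not necessarily on disk yet
            long readBack = region.getLong(offset); // served from memory, no disk read needed
            region.force();                         // explicitly ask the OS to flush the mapped pages
            return readBack;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("store-sketch", ".db");
        file.toFile().deleteOnExit();
        System.out.println(writeAndRead(file, 16, 42L));
    }
}
```

Without the call to force(), the OS alone decides when the modified pages reach the physical disk, which is the behavior the text describes.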

What happens if the system crashes and the data has only been written to memory

and not to disk?

If the OS is in control of when data is flushed to disk for the filesystem cache, what happens

to that data in the event of a system failure?

In short, data held in this memory area is lost. Fear not, however, as Neo4j makes use of a

separate, durable, transaction log to ensure that all transactions are physically written to a

file that is flushed to disk upon every commit. Whenever a commit happens, although the

store files themselves may not yet have physically been updated, the transaction log will

always have the data on disk. The transaction log (covered in section 11.1.5) can then be

used to recover and restore the system when starting up after a failure. In other words, this

transaction log can be used to reconstruct the store files so that they fully reflect what the

system looked like at the point of the last commit, and thus what the store files would have

looked like had the data been flushed to disk before the crash.
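The flush-on-commit and replay-on-startup idea can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions, not Neo4j's real log format or recovery code: the class TxLogSketch, the fixed 16-byte key/value record, and the in-memory map standing in for the store files are all hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Neo4j.
class TxLogSketch {
    // Assumed record layout: an 8-byte key followed by an 8-byte value.
    private static final int RECORD_SIZE = 16;

    // Appends one record and forces it to disk before returning, mimicking flush-on-commit.
    static void commit(Path log, long key, long value) throws IOException {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer record = ByteBuffer.allocate(RECORD_SIZE);
            record.putLong(key).putLong(value).flip();
            while (record.hasRemaining()) {
                ch.write(record);
            }
            ch.force(true); // the commit counts as durable only once this returns
        }
    }

    // Replays every complete record in order to rebuild the state as of the last commit.
    static Map<Long, Long> recover(Path log) throws IOException {
        Map<Long, Long> state = new HashMap<>();
        ByteBuffer all = ByteBuffer.wrap(Files.readAllBytes(log));
        while (all.remaining() >= RECORD_SIZE) {
            long key = all.getLong();
            long value = all.getLong();
            state.put(key, value); // later records overwrite earlier ones
        }
        return state;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("txlog-sketch", ".log");
        log.toFile().deleteOnExit();
        commit(log, 1L, 100L);
        commit(log, 2L, 200L);
        commit(log, 1L, 150L);
        System.out.println(recover(log));
    }
}
```

Because every commit is forced to the log before it returns, replaying the log after a crash yields the state at the last commit even if the cached store data never reached the disk.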

Configuring the filesystem cache

Ideally, you should aim to load as much of your persistent graph data (as stored within the

physical store files on disk) into memory as possible. You can control what parts, and how

much, of the persistent graph you want loaded into memory. In practice, as all graph data

lives in the store files, this generally involves looking at how much space these files occupy
