Building a Graph Database Application - Graph Databases

Replication

Although all writes to a cluster are coordinated through the master, Neo4j does allow writing through slaves. Even then, the slave being written to syncs with the master before returning to the client. Because of the additional network traffic and coordination protocol, writing through slaves can be an order of magnitude slower than writing directly to the master. The only reasons for writing through slaves are to increase the durability guarantees of each write (the write is made durable on two instances, rather than one) and to ensure we can read our own writes when employing cache sharding (see “Cache sharding” on page 80 and “Read your own writes” on page 82 later in this chapter). Because newer versions of Neo4j enable us to specify that writes to the master must be replicated out to one or more slaves, thereby increasing the durability guarantees of writes to the master, the case for writing through slaves is now less compelling. Today it is recommended that all writes be directed to the master, and then replicated to slaves using the ha.tx_push_factor and ha.tx_push_strategy configuration settings.
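
As a rough illustration, the two settings named above might appear in an HA instance's configuration file as shown here. The values are assumptions chosen for the example, not recommendations: a push factor of 2 is meant to ask the master to push each committed transaction to two slaves, and round_robin is assumed to be an accepted strategy name in the Neo4j version in use. The set of valid strategy names has varied between releases, so check the configuration reference for your version.

```properties
# Illustrative values only; verify against your Neo4j version's HA reference.
# Number of slaves the master tries to push each transaction to on commit.
ha.tx_push_factor=2
# How the master chooses which slaves receive the pushed transaction.
ha.tx_push_strategy=round_robin
```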

Buffer writes using queues

In high write load scenarios, we can use queues to buffer writes and regulate load. With this strategy, writes to the cluster are buffered in a queue; a worker then polls the queue and executes batches of writes against the database. Not only does this regulate write traffic, it also reduces contention and lets us pause write operations during maintenance periods without refusing client requests.
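
The buffering strategy described above can be sketched in a few lines of Python. This is a minimal, in-process illustration using the standard library's queue module, not a production design: the BufferedWriter class, its method names, and the execute_batch callback (which stands in for whatever code actually writes a batch to the database) are all hypothetical names introduced for the example.

```python
import queue
import threading
from typing import Any, Callable, Dict, List

Write = Dict[str, Any]  # hypothetical shape of one buffered write


class BufferedWriter:
    """Buffers client writes in a bounded queue and applies them in batches."""

    def __init__(self, execute_batch: Callable[[List[Write]], None],
                 max_batch: int = 100, max_pending: int = 10000) -> None:
        # execute_batch is an assumed callback that writes one batch
        # to the database, ideally in a single transaction.
        self._execute_batch = execute_batch
        self._max_batch = max_batch
        self._pending = queue.Queue(maxsize=max_pending)
        self._running = threading.Event()
        self._running.set()  # start in the draining state

    def submit(self, write: Write) -> None:
        """Accept a client write; raises queue.Full if the buffer is full."""
        self._pending.put_nowait(write)

    def pause(self) -> None:
        """Stop draining (e.g. for maintenance); submit() keeps accepting."""
        self._running.clear()

    def resume(self) -> None:
        """Start draining the buffered writes again."""
        self._running.set()

    def run_once(self, timeout: float = 0.1) -> int:
        """Drain at most one batch; return the number of writes executed."""
        # While paused, wait up to `timeout` seconds, then give up.
        if not self._running.wait(timeout):
            return 0
        batch: List[Write] = []
        try:
            # Block briefly for the first write, then take whatever else
            # is immediately available, up to the batch size.
            batch.append(self._pending.get(timeout=timeout))
            while len(batch) < self._max_batch:
                batch.append(self._pending.get_nowait())
        except queue.Empty:
            pass
        if batch:
            self._execute_batch(batch)
        return len(batch)
```

A worker thread would call run_once() in a loop, while request handlers call submit(). Because the queue is bounded, a sustained overload surfaces as queue.Full rather than unbounded memory growth; how to respond to that (reject, retry, or spill to a durable queue) is a design choice this sketch leaves open. An in-memory queue also loses buffered writes if the process dies, so a durable message queue may be preferable where that matters.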

Global clusters

For applications catering to a global audience, it is possible to install a multiregion cluster in multiple data centers and on cloud platforms such as Amazon Web Services (AWS). A multiregion cluster enables us to service reads from the portion of the cluster geographically closest to the client. In these situations, however, the latency introduced by the physical separation of the regions can sometimes disrupt the coordination protocol; it is, therefore, often desirable to restrict master reelection to a single region. To achieve this, we create slave-only databases for the instances we don't want to participate in master reelection; we do this by including the ha.slave_coordinator_update_mode=none configuration parameter in an instance's configuration.

Load Balancing

When using a clustered graph database, we should consider load balancing traffic across the cluster to help maximize throughput and reduce latency. Neo4j doesn't include a native load balancer, relying instead on the load-balancing capabilities of the network infrastructure.
