The allotted hardware can't process the given workload. As an example, I mentioned working sets in the previous chapter. If your working data set is much larger than the available RAM, then sending random reads to the secondaries is still likely to result in excessive disk access, and thus slow queries.
The ratio of writes to reads exceeds 50%. This is an admittedly arbitrary ratio, but it's a reasonable place to start. The issue here is that every write to the primary must eventually be written to all the secondaries as well. Therefore directing reads to secondaries that are already processing a lot of writes can sometimes slow the replication process and may not result in increased read throughput.
The application requires consistent reads. Secondary nodes replicate asynchronously and therefore aren't guaranteed to reflect the latest writes to the primary node. In pathological cases, secondaries can run hours behind. (A short example below shows how read routing is expressed in practice.)
So you can balance read load with replication, but only in special cases. If you need to
scale and any of the preceding conditions apply, then you'll need a different strategy,
involving sharding, augmented hardware, or some combination of the two.
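To make the routing concrete, here's a minimal sketch of how a query is directed at the primary or at a secondary from the mongo shell. The events collection and the query itself are hypothetical; readPref() sets the read preference for a single cursor.

// Allow this query to run on a secondary; results may lag behind the primary
db.events.find({type: "click"}).readPref("secondaryPreferred")

// Force the query to the primary when the latest writes must be visible
db.events.find({type: "click"}).readPref("primary")

If any of the conditions above apply, keeping reads on the primary (the default read preference) is usually the safer choice.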
8.2 Replica sets
Replica sets are a refinement on master-slave replication, and they're the recommended MongoDB replication strategy. We'll start by configuring a sample replica set. I'll then describe how replication actually works, as this knowledge is incredibly important for diagnosing production issues. We'll end by discussing advanced configuration details, failover and recovery, and best deployment practices.
8.2.1 Setup
The minimum recommended replica set configuration consists of three nodes. Two of these nodes serve as first-class, persistent mongod instances. Either can act as the replica set primary, and both have a full copy of the data. The third node in the set is an arbiter, which doesn't replicate data, but merely acts as a kind of neutral observer. As the name suggests, the arbiter arbitrates: when failover is required, the arbiter helps to elect a new primary node. You can see an illustration of the replica set you're about to set up in figure 8.1.
Start by creating a data directory for each replica set member:
mkdir /data/node1
mkdir /data/node2
mkdir /data/arbiter
Next, start each member as a separate mongod. Since you'll be running each process on the same machine, it's probably easiest to start each mongod in a separate terminal window:
mongod --replSet myapp --dbpath /data/node1 --port 40000
mongod --replSet myapp --dbpath /data/node2 --port 40001
mongod --replSet myapp --dbpath /data/arbiter --port 40002
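Once all three processes are running, the next step is to connect to one of the data-bearing nodes and initiate the replica set. The following is a minimal sketch, assuming the mongo shell and that all members can be addressed as localhost because they run on the same machine; the exact hostnames in your configuration may differ.

mongo --port 40000
> rs.initiate()                  // make this node the initial member
> rs.add("localhost:40001")      // add the second data-bearing node
> rs.addArb("localhost:40002")   // add the arbiter
> rs.status()                    // confirm one PRIMARY, one SECONDARY, one ARBITER

When rs.status() reports a healthy primary and secondary, the set is ready for use.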