The allotted hardware can't process the given workload. As an example, I mentioned working sets in the previous chapter. If your working data set is much larger than the available RAM, then sending random reads to the secondaries is still likely to result in excessive disk access, and thus slow queries.
The ratio of writes to reads exceeds 50%. This is an admittedly arbitrary ratio, but it's a reasonable place to start. The issue here is that every write to the primary must eventually be written to all the secondaries as well. Therefore directing reads to secondaries that are already processing a lot of writes can sometimes slow the replication process and may not result in increased read throughput.
The application requires consistent reads. Secondary nodes replicate asynchronously and therefore aren't guaranteed to reflect the latest writes to the primary node. In pathological cases, secondaries can run hours behind. (A short example below shows how read routing is expressed in practice.)
So you can balance read load with replication, but only in special cases. If you need to
scale and any of the preceding conditions apply, then you'll need a different strategy,
involving sharding, augmented hardware, or some combination of the two.
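To make the routing concrete, here's a minimal sketch of how a query is directed at the primary or at a secondary from the mongo shell. The events collection and the query itself are hypothetical; readPref() sets the read preference for a single cursor.

// Allow this query to run on a secondary; results may lag behind the primary
db.events.find({type: "click"}).readPref("secondaryPreferred")

// Force the query to the primary when the latest writes must be visible
db.events.find({type: "click"}).readPref("primary")

If any of the conditions above apply, keeping reads on the primary (the default read preference) is usually the safer choice.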
8.2 Replica sets
Replica sets are a refinement on master-slave replication, and they're the recommended MongoDB replication strategy. We'll start by configuring a sample replica set. I'll then describe how replication actually works, as this knowledge is incredibly important for diagnosing production issues. We'll end by discussing advanced configuration details, failover and recovery, and best deployment practices.
8.2.1 Setup
The minimum recommended replica set configuration consists of three nodes. Two of these nodes serve as first-class, persistent mongod instances. Either can act as the replica set primary, and both have a full copy of the data. The third node in the set is an arbiter, which doesn't replicate data, but merely acts as a kind of neutral observer. As the name suggests, the arbiter arbitrates: when failover is required, the arbiter helps to elect a new primary node. You can see an illustration of the replica set you're about to set up in figure 8.1.
Start by creating a data directory for each replica set member:
mkdir /data/node1
mkdir /data/node2
mkdir /data/arbiter
Next, start each member as a separate mongod. Since you'll be running each process on the same machine, it's probably easiest to start each mongod in a separate terminal window:
mongod --replSet myapp --dbpath /data/node1 --port 40000
mongod --replSet myapp --dbpath /data/node2 --port 40001
mongod --replSet myapp --dbpath /data/arbiter --port 40002
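Once all three processes are running, the next step is to connect to one of the data-bearing nodes and initiate the replica set. The following is a minimal sketch, assuming the mongo shell and that all members can be addressed as localhost because they run on the same machine; the exact hostnames in your configuration may differ.

mongo --port 40000
> rs.initiate()                  // make this node the initial member
> rs.add("localhost:40001")      // add the second data-bearing node
> rs.addArb("localhost:40002")   // add the arbiter
> rs.status()                    // confirm one PRIMARY, one SECONDARY, one ARBITER

When rs.status() reports a healthy primary and secondary, the set is ready for use.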