Replication - MongoDB in Action

ascertain its own health. When you run rs.status(), you see the timestamp of each node's last heartbeat along with its state of health (1 means healthy and 0 means unresponsive).
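As a rough illustration of reading those health flags, the sketch below filters a document shaped like a trimmed rs.status() result. The members, name, health, stateStr, and lastHeartbeat fields are real parts of that output, but the set name, host names, and dates here are invented, and unresponsiveMembers is a hypothetical helper rather than a shell built-in.

```javascript
// Sample document shaped like a trimmed rs.status() result.
// The set name, host names, and dates are made up for illustration.
const sampleStatus = {
  set: "example-set",
  members: [
    { name: "host-a:40000", health: 1, stateStr: "PRIMARY" },
    {
      name: "host-b:40001",
      health: 1,
      stateStr: "SECONDARY",
      lastHeartbeat: new Date("2011-02-01T11:26:36Z"),
    },
    {
      name: "host-c:40002",
      health: 0,
      stateStr: "(not reachable/healthy)",
      lastHeartbeat: new Date("2011-02-01T11:20:02Z"),
    },
  ],
};

// Hypothetical helper: return the names of members whose health flag is not 1.
function unresponsiveMembers(status) {
  return status.members.filter((m) => m.health !== 1).map((m) => m.name);
}

console.log(unresponsiveMembers(sampleStatus)); // [ 'host-c:40002' ]
```

In a live shell you could pass the result of rs.status() to the same helper instead of the sample document.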

As long as every node remains healthy and responsive, the replica set will hum along its merry way. But if any node becomes unresponsive, action may be taken. What every replica set wants is to ensure that exactly one primary node exists at all times. But this is possible only when a majority of nodes is visible. For example, look back at the replica set you built in the previous section. If you kill the secondary, then a majority of nodes still exists, so the replica set doesn't change state but simply waits for the secondary to come back online. If you kill the primary, then a majority still exists, but there's no primary. Therefore, the secondary is automatically promoted to primary. If more than one secondary happens to exist, then the most current secondary will be the one elected.
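The majority rule and the "most current secondary wins" rule can be sketched as a toy model. This is only an approximation of the behavior described above, not MongoDB's election code: hasMajority and electPrimary are hypothetical names, and optime is a plain number standing in for a real oplog timestamp.

```javascript
// True when the surviving members form a strict majority of the configured set.
function hasMajority(totalMembers, visibleMembers) {
  return visibleMembers > Math.floor(totalMembers / 2);
}

// Toy model of the outcome after some members die.
// `survivors` lists the members that are still up and can see each other.
// Returns the name of the primary, or null when no primary can exist.
function electPrimary(totalMembers, survivors) {
  if (!hasMajority(totalMembers, survivors.length)) return null;

  // A surviving primary that still sees a majority simply stays primary.
  const current = survivors.find((m) => m.role === "primary");
  if (current) return current.name;

  // Otherwise the most current secondary is promoted; arbiters never are.
  const secondaries = survivors.filter((m) => m.role === "secondary");
  if (secondaries.length === 0) return null;
  return secondaries.reduce((best, m) => (m.optime > best.optime ? m : best)).name;
}

// Three-member set (primary, secondary, arbiter) after the primary is killed.
const afterPrimaryDies = [
  { name: "secondary", role: "secondary", optime: 10 },
  { name: "arbiter", role: "arbiter" },
];
console.log(electPrimary(3, afterPrimaryDies)); // secondary
```

Calling electPrimary(3, [...]) with only the secondary removed returns the existing primary, which matches the "simply waits" case in the text.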

But there are other possible scenarios. Imagine that both the secondary and the arbiter are killed. Now the primary remains, but there's no majority: only one of the three original nodes remains healthy. In this case, you'll see a message like this in the primary's log:

Tue Feb 1 11:26:38 [rs Manager] replSet can't see a majority of the set, relinquishing primary
Tue Feb 1 11:26:38 [rs Manager] replSet relinquishing primary state
Tue Feb 1 11:26:38 [rs Manager] replSet SECONDARY

With no majority, the primary actually demotes itself to a secondary. This may seem puzzling, but think about what might happen if this node were allowed to remain primary. If the heartbeats fail due to some kind of network partition, then the other nodes will still be online. If the arbiter and secondary are still up and able to see each other, then according to the rule of the majority, the remaining secondary will become a primary. If the original primary doesn't step down, then you're suddenly in an untenable situation: a replica set with two primary nodes. If the application continues to run, then it might write to and read from two different primaries, a sure recipe for inconsistency and truly bizarre application behavior. Therefore, when the primary can't see a majority, it must step down.
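The reason the step-down rule prevents two primaries is simple arithmetic: two disjoint groups can't both hold more than half of the set. The sketch below checks that for a few partitions. It is a hypothetical model with an invented function name, not code from the server.

```javascript
// Given the set size and the sizes of the groups a network partition creates,
// return the indexes of the groups that can still see a strict majority.
// Only such a group may host a primary; every other group must have none.
function groupsWithMajority(totalMembers, groupSizes) {
  return groupSizes
    .map((size, index) => ({ size, index }))
    .filter((group) => group.size > totalMembers / 2)
    .map((group) => group.index);
}

// Three-member set split into [old primary] and [secondary + arbiter].
console.log(groupsWithMajority(3, [1, 2])); // [ 1 ]

// Three-member set where every node is isolated: nobody may be primary.
console.log(groupsWithMajority(3, [1, 1, 1])); // []
```

In the first case the old primary sits alone in group 0, so it steps down while group 1 elects the secondary. However the nodes are split, the result never contains more than one index.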

COMMIT AND ROLLBACK

One final important point to understand about replica sets is the concept of a commit. In essence, you can write to a primary node all day long, but those writes won't be considered committed until they've been replicated to a majority of nodes. What do I mean by committed here? The idea can best be explained by example. Imagine again the replica set you built in the previous section. Suppose you issue a series of writes to the primary that don't get replicated to the secondary for some reason (connectivity issues, secondary is down for backup, secondary is lagging, and so forth). Now suppose further that the secondary is suddenly promoted to primary. You write to the new primary, and eventually the old primary comes back online and tries to replicate from
