This is a tricky problem that is often application-dependent, so the drivers punt on the
issue. You must catch whatever exception is thrown on network errors (you should be
able to find info about how to do this in your driver's documentation). Handle the
exception and then figure out on a request-by-request basis: do you want to resend the
message? Do you need to check state on the database first? Can you just give up, or do
you need to keep retrying?
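The per-request decision described above can be sketched roughly as follows. This is a minimal illustration, not any driver's API: `NetworkError` is a hypothetical stand-in for whatever exception your driver raises on network failure, and `send_with_policy`, `send`, and `is_applied` are names invented for this example.

```python
import time


class NetworkError(Exception):
    """Hypothetical stand-in for the driver's network exception."""


def send_with_policy(send, is_applied=None, retries=3, delay=0.0):
    """Run send() and decide what to do if the network fails.

    send       -- callable that performs the request (assumed)
    is_applied -- optional callable that checks state on the database
                  and returns True if an earlier attempt went through
    retries    -- how many times to resend before giving up
    delay      -- seconds to wait between attempts
    """
    attempt = 0
    while True:
        try:
            return send()
        except NetworkError:
            # Check state first: the message may have arrived even
            # though the reply was lost.
            if is_applied is not None and is_applied():
                return None
            # Give up once the retry budget is spent.
            if attempt >= retries:
                raise
            attempt += 1
            time.sleep(delay)
```

A request that is safe to repeat can be passed with only `send`; one that is not safe to repeat would also supply `is_applied`, and a request you are happy to drop can use `retries=0`.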
Tip #20: Handle replica set failure and failover
Your application should be able to handle all of the exciting failure scenarios that could
occur with a replica set.
Suppose your application throws a “not master” error. There are a couple of possible
causes for this error. Your set might be failing over to a new primary, in which case you
have to handle the time during the primary's election gracefully. The time it takes for an
election varies: it's usually a few seconds, but if you're unlucky it could be 30 seconds or
more. Alternatively, if you're on the wrong side of a network partition, you might not be
able to see a master for hours.
Not being able to see a master at all is an important case to handle: can your application
drop into read-only mode if this happens? Your application should be able to handle
being read-only for short periods (during a master election) and long periods (when a
majority is down or partitioned).
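One way to picture dropping into read-only mode is a small wrapper around writes. This is only a sketch under stated assumptions: `NotMasterError` is a hypothetical stand-in for the driver's “not master” error, and `WriteGate` and `ReadOnlyMode` are names made up for this example.

```python
class NotMasterError(Exception):
    """Hypothetical stand-in for the driver's "not master" error."""


class ReadOnlyMode(Exception):
    """Raised to tell the caller that writes are currently disabled."""


class WriteGate:
    """Tracks whether the application should treat itself as read-only."""

    def __init__(self):
        self.read_only = False

    def write(self, do_write):
        """Attempt a write; switch to read-only if no master is visible.

        do_write -- callable that performs the write (assumed)
        """
        try:
            result = do_write()
        except NotMasterError:
            # No master could be reached: remember that, and let the
            # caller show a "read-only" message instead of crashing.
            self.read_only = True
            raise ReadOnlyMode("no master reachable; writes are disabled")
        # A successful write means a master is visible again.
        self.read_only = False
        return result
```

Because every write is still attempted, the gate notices on its own when an election finishes or a partition heals. The rest of the application can check `read_only` to hide or disable write features in the meantime.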
Regardless of whether there's a master, you should be able to continue sending reads
to whichever members of the set you can reach.
Members may briefly go through an unreadable “recovering” phase during elections:
members in this state will throw errors about not being masters or secondaries if your
driver tries to read from them, and may be in this state so fleetingly that these errors
slip in between the pings drivers send to the database.
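Since these errors can be fleeting, a read can simply move on to the next reachable member. The sketch below assumes a hypothetical `MemberUnavailable` exception and a caller-supplied `read` function. Real drivers usually choose members themselves, so treat this as an illustration of the idea rather than of any driver's behavior.

```python
class MemberUnavailable(Exception):
    """Hypothetical: the member is unreachable or in a recovering state."""


def read_from_any(members, read):
    """Try each member in turn and return the first successful read.

    members -- iterable of member identifiers (assumed)
    read    -- callable taking a member and returning the query result
    """
    last_error = None
    for member in members:
        try:
            return read(member)
        except MemberUnavailable as exc:
            # This member cannot serve reads right now; try the next one.
            last_error = exc
    # Nothing answered: surface the failure, keeping the last cause.
    raise MemberUnavailable("no readable member") from last_error
```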
 