Replication - MongoDB in Action


8.4.3 Read scaling

Replicated databases are great for read scaling. If a single server can't handle the application's read load, then you have the option to route queries to more than one replica. Most of the drivers have built-in support for sending queries to secondary nodes. With the Ruby driver, this is provided as an option on the ReplSetConnection constructor:

Mongo::ReplSetConnection.new(['arete', 40000], ['arete', 40001],
                             :read => :secondary)

When the :read argument is set to :secondary, the connection object will choose a random, nearby secondary to read from.
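To make "random, nearby secondary" concrete, here is a minimal sketch of one way such a selection could work. It is not the driver's actual implementation: the Node struct, the 15 ms latency window, the nearby_secondaries and pick_secondary helpers, and the third host are all assumptions made for illustration.

```ruby
# Hypothetical sketch: choose a random secondary among those "near" the client.
# Node, LATENCY_WINDOW_MS, nearby_secondaries, and pick_secondary are
# illustrative names, not part of the Ruby driver.
Node = Struct.new(:host, :port, :ping_ms)

LATENCY_WINDOW_MS = 15 # assumed cutoff for what counts as "nearby"

# Keep only the secondaries whose ping time is within the window
# of the fastest secondary.
def nearby_secondaries(secondaries, window_ms = LATENCY_WINDOW_MS)
  return [] if secondaries.empty?
  fastest = secondaries.map(&:ping_ms).min
  secondaries.select { |node| node.ping_ms - fastest <= window_ms }
end

# Pick one nearby secondary at random; returns nil if there are none.
def pick_secondary(secondaries, rng = Random.new)
  nearby_secondaries(secondaries).sample(random: rng)
end

secondaries = [
  Node.new('arete', 40001, 2),
  Node.new('arete', 40002, 9),
  Node.new('far-away', 40003, 120) # hypothetical distant node
]
chosen = pick_secondary(secondaries)
puts "reading from #{chosen.host}:#{chosen.port}"
```

Choosing at random among the nearby nodes spreads reads across them, while the latency window keeps a distant replica from serving slow reads.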

Other drivers can be configured to read from secondaries by setting a slaveOk option. When the Java driver is connected to a replica set, setting slaveOk to true will enable secondary load balancing on a per-thread basis. The load balancing implementations found in the drivers are designed to be generally applicable, so they may not work for all apps. When that's the case, users frequently write their own. As usual, consult your driver's documentation for specifics.

Many MongoDB users scale with replication in production. But there are three cases where this sort of scaling won't be sufficient. The first concerns the number of servers needed. As of MongoDB v2.0, replica sets support a maximum of 12 members, 7 of which can vote. If you need even more replicas for scaling, you can use master-slave replication. But if you don't want to sacrifice automated failover and you need to scale beyond the replica set maximum, then you'll need to migrate to a sharded cluster.
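The member limits above can be checked before a configuration is applied. The following is a rough sketch only: the member hashes are a simplified stand-in for a real replica set configuration, replica_set_errors is a hypothetical helper, and treating a missing votes value as 1 is an assumption of this example.

```ruby
# Limits quoted in the text for MongoDB v2.0.
MAX_MEMBERS = 12
MAX_VOTERS  = 7

# Hypothetical helper: returns a list of limit violations for a planned set.
# Each member is a simplified hash such as { host: 'arete:40000', votes: 1 };
# a missing :votes key is assumed to mean one vote.
def replica_set_errors(members)
  errors = []
  if members.size > MAX_MEMBERS
    errors << "too many members: #{members.size} > #{MAX_MEMBERS}"
  end
  voters = members.count { |member| member.fetch(:votes, 1) > 0 }
  if voters > MAX_VOTERS
    errors << "too many voting members: #{voters} > #{MAX_VOTERS}"
  end
  errors
end

# Thirteen members, all voting by default: both limits are exceeded.
planned = Array.new(13) { |i| { host: "arete:#{40000 + i}" } }
replica_set_errors(planned).each { |message| puts message }
```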

The second case involves applications with a high write load. As mentioned at the beginning of the chapter, secondaries must keep up with this write load. Sending reads to write-laden secondaries may inhibit replication.

A third situation that replica scaling can't handle is consistent reads. Because replication is asynchronous, replicas aren't always going to reflect the latest writes to the primary. Therefore, if your application reads arbitrarily from secondaries, then the picture presented to end users isn't always guaranteed to be fully consistent. For applications whose main purpose is to display content, this almost never presents a problem. But other apps, where users are actively manipulating data, will require consistent reads. In these cases, you have two options. The first is to separate the parts of the application that need consistent reads from the parts that don't. The former can always be read from the primary, and the latter can be distributed to secondaries. When this strategy is either too complicated or simply doesn't scale, sharding is the way to go.[13]

[13] Note that to get consistent reads from a sharded cluster, you must always read from the primary nodes of each shard, and you must issue safe writes.
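The first option described above, separating reads that need consistency from reads that don't, can be sketched as a small routing helper. ReadRouter and target_for are hypothetical names, not part of any driver, and plain strings stand in for real connections so the example stays self-contained.

```ruby
# Hypothetical helper that decides where a read should go.
# Strings stand in for connection objects in this sketch.
class ReadRouter
  def initialize(primary, secondaries, rng = Random.new)
    @primary = primary
    @secondaries = secondaries
    @rng = rng
  end

  # consistent: true  -> always the primary
  # consistent: false -> a random secondary, or the primary if none exist
  def target_for(consistent:)
    return @primary if consistent || @secondaries.empty?
    @secondaries.sample(random: @rng)
  end
end

router = ReadRouter.new('arete:40000', ['arete:40001', 'arete:40002'])
puts router.target_for(consistent: true)   # e.g. data the user is actively editing
puts router.target_for(consistent: false)  # e.g. pages that only display content
```

With the Ruby driver, the two targets could be two connection objects, one of them created with :read => :secondary as shown earlier, so that each part of the application uses the connection that matches its consistency needs.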
