Databases Reference
In-Depth Information
often rebalance the load by moving entire shards, which is easier than moving part of
a shard (and more efficient, in terms of cost per row of data).
One strategy that works well is to use a dynamic sharding strategy and assign new data
to shards randomly. When a shard gets full enough, you can set a flag that tells the
application not to give it any new data. You can then flip the flag back if you want more
data on that shard in the future.
Suppose you install a new MySQL node and place 100 shards on it. To begin, you set
their flags to 1 , so the application knows they're ready for new data. Once they each
have enough data (10,000 users each, for example), you set their flags to 0 . Then, if the
node becomes underloaded after a while because of abandoned accounts, you can re-
open some of the shards and add new users to them.
If you upgrade the application and add features that make each shard's query load
higher, or if you just miscalculated the load, you can move some of the shards to new
nodes to ease the load. The drawback is that an entire shard might be read-only or
offline while you do this. It's up to you and your users to decide whether that's
acceptable.
Another tactic we use a lot is to set up two replicas of a shard, each with a complete
copy of the shard's data. We then make each replica responsible for half of the data,
and stop sending queries to the master completely. Each replica contains some data
it doesn't use; we set up a background job with a tool such as Percona Toolkit's
pt-archiver to remove the unwanted data. This is simple and requires practically zero
downtime.
Generating globally unique IDs
When you convert a system to use a sharded data store, you frequently need to generate
globally unique IDs on many machines. A monolithic data store often uses AUTO_
INCREMENT columns for this purpose, but that doesn't tend to work well across many
servers. There are several ways to solve this problem:
Use auto_increment_increment and auto_increment_offset
These two server settings instruct MySQL to increment AUTO_INCREMENT columns
by a desired value and to begin numbering from a desired offset. For example, in
the simplest case with two servers, you can configure the servers to increment by
two, set one server's offset to one, and set the other's to two (you can't set either
value to zero). Now one server's columns will always contain even numbers, and
the other's will always contain odd numbers. The setting applies to all tables in the
server.
Because of its simplicity and lack of dependency on a central node, this is a popular
way to generate values, but it requires you to be careful with your server configu-
rations. It's easy to accidentally configure servers so that they generate duplicate
 
Search WWH ::




Custom Search