Scaling MySQL - High Performance MySQL

Databases Reference

In-Depth Information

allocation strategy instead of a dynamic one, because it's simpler. That said, most ap-

plications that use fixed allocation end up with a dynamic allocation strategy sooner

or later.

Dynamic allocation

The alternative to fixed allocation is a dynamic allocation strategy that you store sep-

arately, mapping each unit of data to a shard. An example is a two-column table of user

IDs and shard IDs:

CREATE TABLE user_to_shard (

user_id INT NOT NULL,

shard_id INT NOT NULL,

PRIMARY KEY (user_id)

);

The table itself is the partitioning function. Given a value for the partitioning key (the

user ID), you can find the shard ID. If the row doesn't exist, you can pick the desired

shard and add it to the table. You can also change it later—that's what makes this a

dynamic allocation strategy.

Dynamic allocation adds overhead to the partitioning function because it requires a

call to an external resource, such as a directory server (a data storage node that stores

the mapping). Such an architecture often needs more layers for efficiency. For example,

you can use a distributed caching system to store the directory server's data in memory,

because in practice it doesn't change all that much. Or—perhaps more commonly—

you can just add a shard_id column to the users table and store it there.

The biggest advantage of dynamic allocation is fine-grained control over where the data

is stored. This makes it easier to allocate data to the shards evenly and gives you a lot

of flexibility to accommodate changes you don't foresee.

A dynamic mapping also lets you build multiple levels of sharding strategies on top of

the simple key-to-shard mapping. For example, you can build a dual mapping that

assigns each unit of sharding to a group (e.g., a group of users in the book club), and

then keeps the groups together on a shard where possible. This lets you take advantage

of shard affinities, so you can avoid cross-shard queries.

If you use a dynamic allocation strategy, you can have imbalanced shards. This can be

useful when your servers aren't all equally powerful, or when you want to use some of

them for different purposes, such as archived data. If you also have the ability to reba-

lance shards at any time, you can maintain a one-to-one mapping of shards to nodes

without wasting capacity. Some people prefer the simplicity of one shard per node.

(But remember, there are advantages to keeping shards small.)

Dynamic allocation and smart use of shard affinities can prevent your cross-shard

queries from growing as you scale. Imagine a cross-shard query in a data store with

four nodes. In a fixed allocation, any given query might require touching all shards, but

a dynamic allocation strategy might let you run the same query on only three of the

Search WWH ::

Custom Search

Home