Scaling MySQL - High Performance MySQL

Databases Reference

In-Depth Information

numbers, especially if you move them into different roles as you add more servers,

or when you recover from failures.

Create a table in the global node

You can create a table with an AUTO_INCREMENT column in your global database

node, and applications can use this to generate unique numbers.

Use memcached

There's an incr() function in the memcached API that can increment a number

atomically and return the result. You can use Redis, too.

Allocate numbers in batches

The application can request a batch of numbers from a global node, use all the

numbers, and then request more.

Use a combination of values

You can use a combination of values, such as the shard ID and an incrementing

number, to make each server's values unique. See the discussion of this technique

in the previous section.

Use GUID values

You can generate globally unique values with the UUID() function. Beware, though:

this function does not replicate correctly with statement-based replication, al-

though it works fine if your application selects the value into its own memory and

then uses it as a literal in statements. GUID values are large and nonsequential, so

they don't make good primary keys for InnoDB tables. See “Inserting rows in pri-

mary key order with InnoDB” on page 173 for more on this. There's also a

UUID_SHORT() function in MySQL 5.1 and newer versions, which has some nice

properties such as being sequential and only 64 bits instead of 128.

If you use a global allocator to generate values, be careful that the single point of con-

tention doesn't create a bottleneck for your application.

Although the memcached approach can be very fast (tens of thousands of values per

second), it isn't persistent. Each time you restart the memcached service, you'll need to

initialize the value in the cache. This could require you to find the maximum value

that's in use across all shards, which might be very slow and difficult to do atomically.

Tools for sharding

One of the first things you'll have to do when designing a sharded application is write

code for querying multiple data sources.

It's generally a poor design to expose the multiple data sources to the application

without any abstraction, because it can add a lot of code complexity. It's better to hide

the data sources behind an abstraction layer. This layer might handle the following

tasks:

• Connecting to the correct shard and querying it

• Distributed consistency checks

Search WWH ::

Custom Search

Home