Databases Reference
In-Depth Information
Smaller shards are easier to move around, too. This makes it easier to reallocate capacity
and rebalance the shards among the nodes. Moving a shard is generally not an efficient
process. You typically need to put the affected shard into read-only mode (a feature
you'll need to build into your application), extract the data, and move it to another
node. This usually involves using mysqldump to export the data and mysql to reload it.
If you're using Percona Server, you can use XtraBackup to move the files between
servers, which is much more efficient than dumping and reloading.
In addition to moving shards between nodes, you might need to think about moving
data between shards, preferably without interrupting service for the whole application.
If your shards are large, it will be harder to balance capacity by moving entire shards
around, so you'll probably need a way to move the individual bits of data (for example,
a single user) between shards. Moving data between shards is usually a lot more com-
plicated than just moving shards, so it's best not to do it if possible. That's one reason
we recommend keeping the shard size manageable.
The relative size of your shards depends on the application's needs. As a rough guide,
a “manageable size” for us is one that keeps tables small enough that we can perform
regular maintenance jobs, such as ALTER TABLE , CHECK TABLE , or OPTIMIZE TABLE , within
5 or 10 minutes.
If you make your shards too small, you might end up with too many tables, which can
cause problems with the filesystem or MySQL's internal structures. Small shards might
also increase the number of cross-shard queries you need to make.
Arranging shards on nodes
You'll need to decide how you want to arrange the shards on a node. Here are some
common methods:
• Use a single database per shard, and use the same name for each shard's database.
This method is typical when you want each shard to mirror the original applica-
tion's structure. It can work well when you're making many application instances,
each of which is aware of only one shard.
• Place tables from several shards into one database, and include the shard number
in each table's name (e.g., bookclub.comments_23 ). A single database can hold mul-
tiple shards in this configuration.
• Use a single database per shard, and include all the application's tables in the
database. Include the shard number in the database name but not the table name
(e.g., the tables might be named bookclub_23.comments , bookclub_23.users , and so
on). This is common when an application connects to a single database and doesn't
specify the database name in any of its queries. The advantage is that you don't
need to customize the queries per shard, and it can ease the transition to sharding
for an application that uses only one database.
 
Search WWH ::




Custom Search