Query Performance Optimization - High Performance MySQL

Databases Reference

In-Depth Information

Case Studies

Sometimes it's not about query optimization, schema optimization, index optimiza-

tion, or application design optimization—it's about all of these practices put together.

The case studies in this section illustrate how to approach some design challenges that

frequently cause problems for users. You might also be interested in Bill Karwin's book

SQL Antipatterns (Pragmatic Bookshelf). It has recipes for solving particular problems

with SQL that often trap the unwary programmer into a poor solution.

Building a Queue Table in MySQL

Building a queue in MySQL is tricky, and most designs we've seen don't work well

when the system experiences high traffic and lots of concurrency. The typical pattern

is to have a table that contains several types of rows: rows that haven't been processed,

rows in process, and finished rows. One or more worker processes look for unprocessed

rows, update them to “claim” them, and then perform the work and update them to

mark them as finished. Common examples include emails that are ready to send, orders

to process, comments to moderate, and so on.

There are two broad reasons why this doesn't work well. First, the table tends to grow

very large, and searching for the unprocessed rows becomes slow when the table is large

and the indexes are many levels deep. You can solve this by splitting the queue into

two tables and moving the completed rows to the archive or history table, which helps

keep the queue table small.

The second reason is that the process of finding work to do is usually implemented

with polling and locking. Polling creates load on the server, and locking creates con-

tention and serialization between worker processes. We'll see later, in Chapter 11 , why

that limits scalability.

Polling might actually be okay, but if it's not, you can use notifications to tell workers

that there's work to do. One technique is to use the SLEEP() function with a very long

timeout and an indicative comment, such as the following:

SELECT /* waiting on unsent_emails */ SLEEP(10000);

This will cause the thread to block until one of two things happens: it times out after

10,000 seconds, or another thread issues KILL QUERY and terminates it. So, after insert-

ing a batch of queries into the table, you can look at SHOW PROCESSLIST , find threads

that are running queries with the magical comment, and kill the queries. You can also

implement a form of notification with the GET_LOCK() and RELEASE_LOCK() functions,

or you can do it outside of the database, with a messaging service.

The final problem is how workers should claim rows so that they don't get processed

multiple times. We often see this implemented with SELECT FOR UPDATE . This is usually

a huge scalability bottleneck and causes a lot of pileups as transactions block on each

other and wait.

Search WWH ::

Custom Search

Home