queries per second from a single correspondent on a gigabit network, so running multiple queries isn't necessarily such a bad thing.
Network round trips are still slow, though, compared to the rate at which MySQL can traverse rows internally, which is counted in millions of rows per second for in-memory data. All else being equal, it's still a good idea to use as few queries as possible, but sometimes you can make a query more efficient by decomposing it and executing a few simple queries instead of one complex one. Don't be afraid to do this; weigh the costs, and go with the strategy that causes less work. We show some examples of this technique a little later in the chapter.
That said, using too many queries is a common mistake in application design. For
example, some applications perform 10 single-row queries to retrieve data from a table
when they could use a single 10-row query. We've even seen applications that retrieve
each column individually, querying each row many times!
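To make the difference concrete, here is a minimal sketch, assuming a hypothetical users table and PyMySQL as the driver (any DB-API connector looks much the same):

    import pymysql  # assumed driver; other connectors behave similarly

    conn = pymysql.connect(host="localhost", user="app",
                           password="secret", database="app")
    ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    with conn.cursor() as cur:
        # Wasteful: one network round trip per row
        for user_id in ids:
            cur.execute("SELECT * FROM users WHERE id = %s", (user_id,))
            row = cur.fetchone()

        # Better: a single round trip retrieves all ten rows
        placeholders = ", ".join(["%s"] * len(ids))
        cur.execute(f"SELECT * FROM users WHERE id IN ({placeholders})", ids)
        rows = cur.fetchall()

    conn.close()

The IN() version pays the per-statement network and parsing overhead once instead of ten times.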
Chopping Up a Query
Another way to slice up a query is to divide and conquer, keeping it essentially the same
but running it in smaller “chunks” that affect fewer rows each time.
Purging old data is a great example. Periodic purge jobs might need to remove quite a bit of data, and doing this in one massive query could lock a lot of rows for a long time, fill up transaction logs, hog resources, and block small queries that shouldn't be interrupted. Chopping up the DELETE statement and using medium-size queries can improve performance considerably, and reduce replication lag when a query is replicated. For example, instead of running this monolithic query:
mysql> DELETE FROM messages WHERE created < DATE_SUB(NOW(),INTERVAL 3 MONTH);
you could do something like the following pseudocode:
rows_affected = 0
do {
    rows_affected = do_query(
        "DELETE FROM messages WHERE created < DATE_SUB(NOW(),INTERVAL 3 MONTH)
         LIMIT 10000")
} while rows_affected > 0
Deleting 10,000 rows at a time is typically a large enough task to make each query
efficient, and a short enough task to minimize the impact on the server⁴ (transactional
storage engines might benefit from smaller transactions). It might also be a good idea
to add some sleep time between the DELETE statements to spread the load over time and
reduce the amount of time locks are held.
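As a concrete version of that pseudocode, here is a sketch in Python, again assuming PyMySQL; the chunk size and sleep interval are illustrative and should be tuned to the workload:

    import time
    import pymysql  # assumed driver, as in the earlier sketch

    conn = pymysql.connect(host="localhost", user="app",
                           password="secret", database="app")
    CHUNK = 10000  # rows per DELETE; adjust for your server

    try:
        with conn.cursor() as cur:
            while True:
                # execute() returns the number of rows the DELETE affected
                affected = cur.execute(
                    "DELETE FROM messages "
                    "WHERE created < DATE_SUB(NOW(), INTERVAL 3 MONTH) "
                    "LIMIT %s", (CHUNK,))
                conn.commit()      # keep each transaction small
                if affected == 0:
                    break          # nothing left to purge
                time.sleep(0.5)    # spread the load between chunks
    finally:
        conn.close()

Committing after every chunk keeps each transaction small, and the pause between DELETE statements gives replicas and competing queries a chance to catch up.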
4. Percona Toolkit's pt-archiver tool makes these types of jobs easy and safe.
 