Using Sphinx with MySQL - High Performance MySQL

Databases Reference

In-Depth Information

Sphinx will perform both indexed searches and scans faster than MySQL. Sometimes

Sphinx actually performs a full scan faster than MySQL can perform an index read.

Finding the Top Results in Order

Web applications frequently need the top N results in order. As we discussed previously,

this is hard to optimize in MySQL 5.5 and older versions.

The worst case is when the WHERE condition finds many rows (let's say 1 million) and

the ORDER BY columns aren't indexed. MySQL uses the index to identify all the matching

rows, reads the records one by one into the sort buffer with semirandom disk reads,

sorts them all with a filesort, and then discards most of them. It will temporarily store

and process the entire result, ignoring the LIMIT clause and churning RAM. And if the

result set doesn't fit in the sort buffer, it will need to go to disk, causing even more

disk I/O.

This is an extreme case, and you might think it happens rarely in the real world, but in

fact the problems it illustrates happen often. MySQL's limitations on indexes for

sorting—using only the leftmost part of the index, not supporting loose index scans,

and allowing only a single range condition—mean many real-world queries can't ben-

efit from indexes. And even when they can, using semirandom disk I/O to retrieve rows

is a performance killer.

Paginated result sets, which usually require queries of the form SELECT ... LIMIT N,

M , are another performance problem in MySQL. They read N + M rows from disk, causing

a large amount of random I/O and wasting memory resources. Sphinx can accelerate

such queries significantly by eliminating the two biggest problems:

Memory usage

Sphinx's RAM usage is always strictly limited, and the limit is configurable. Sphinx

supports a result set offset and size similar to the MySQL LIMIT N, M syntax, but it

also has a max_matches option. This controls the equivalent of the “sort buffer” size,

on both a per-server and a per-query basis. Sphinx's RAM footprint is guaranteed

to be within the specified limits.

I/O

If attributes are stored in RAM, Sphinx does not do any I/O at all. And even if

attributes are stored on disk, Sphinx will perform sequential I/O to read them,

which is much faster than MySQL's semirandom retrieval of rows from disks.

You can sort search results by a combination of relevance (weight), attribute values,

and (when using GROUP BY ) aggregate function values. The sorting clause syntax is sim-

ilar to a SQL ORDER BY clause:

<?php

$cl = new SphinxClient ();

$cl->SetSortMode ( SPH_SORT_EXTENDED, 'price DESC, @weight ASC' );

// more code and Query() call here...

?>

Search WWH ::

Custom Search

Home