Using Sphinx with MySQL - High Performance MySQL


uncompress the rows transmitted over the network, respectively, but the overall

indexing time could be up to 20-30% lower because of greatly reduced network traffic.

Search clusters can suffer from occasional overload, too, so Sphinx provides a few ways

to help avoid searchd going off on a spin.

First, a max_children option simply limits the total number of concurrently running

queries and tells clients to retry when that limit is reached.
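
The refuse-and-retry behavior described above can be sketched in a few lines of Python. This is a toy model of the idea, not searchd's actual implementation: the names LimitedServer, ServerBusy, and query_with_retry are hypothetical, and the retry count and delay are arbitrary assumptions.

```python
import threading
import time


class ServerBusy(Exception):
    """Raised when the concurrency limit is reached; the caller should retry."""


class LimitedServer:
    """Toy server that refuses work beyond max_children concurrent queries."""

    def __init__(self, max_children):
        self._slots = threading.BoundedSemaphore(max_children)

    def query(self, handler):
        # Non-blocking acquire: refuse immediately instead of queueing.
        if not self._slots.acquire(blocking=False):
            raise ServerBusy("too many concurrent queries, retry later")
        try:
            return handler()
        finally:
            self._slots.release()


def query_with_retry(server, handler, attempts=3, delay=0.05):
    """Client side: pause briefly and retry when the server reports it is busy."""
    for attempt in range(attempts):
        try:
            return server.query(handler)
        except ServerBusy:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(delay)
```

Refusing immediately, rather than queueing, keeps the number of in-flight queries bounded and leaves the decision about how long to wait with the client.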

Then there are query-level limits. You can specify that query processing stop either at

a given threshold of matches found or a given threshold of elapsed time, using the

SetLimits() and SetMaxQueryTime() API calls, respectively. This is done on a per-query

basis, so you can ensure that more important queries always complete fully.
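
As a rough illustration, the per-query limits might be applied as follows. The sketch assumes the client exposes SetLimits(offset, limit, max_matches, cutoff) and SetMaxQueryTime(milliseconds), with 0 meaning "no limit" for both the cutoff and the time; check the argument order against the API client shipped with your Sphinx version. RecordingClient is a hypothetical stand-in that only records calls so the example runs without a Sphinx installation, and the threshold values are arbitrary.

```python
class RecordingClient:
    """Hypothetical stand-in for a Sphinx API client; it only records calls."""

    def __init__(self):
        self.calls = []

    def SetLimits(self, offset, limit, maxmatches=0, cutoff=0):
        self.calls.append(("SetLimits", offset, limit, maxmatches, cutoff))

    def SetMaxQueryTime(self, maxquerytime):
        self.calls.append(("SetMaxQueryTime", maxquerytime))


def apply_query_limits(client, important, cutoff=10000, max_query_ms=500):
    """Let important queries run to completion; cap everything else."""
    if important:
        # Assumed convention: 0 disables the match cutoff and the time limit.
        client.SetLimits(0, 20, 1000, 0)
        client.SetMaxQueryTime(0)
    else:
        # Stop after `cutoff` matches or `max_query_ms` milliseconds.
        client.SetLimits(0, 20, 1000, cutoff)
        client.SetMaxQueryTime(max_query_ms)
    return client
```

With a real client, the same function would be called once before each query, which is what makes the limits a per-query decision.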

Finally, periodic indexer runs can cause bursts of additional I/O that will in turn cause

intermittent searchd slowdowns. To prevent that, options that limit indexer disk I/O

exist. max_iops enforces a minimum delay between I/O operations that ensures that no

more than max_iops disk operations per second will be performed. But even a single

operation could be too much; consider a 100 MB read() call as an example. The

max_iosize option takes care of that, guaranteeing that the length of every disk read

or write will be under a given boundary. Larger operations are automatically split into

smaller ones, which are then subject to the max_iops limit.
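
The interaction between the two options can be modeled with a small Python sketch. It illustrates the idea only and is not indexer's code: ThrottledReader is a hypothetical name, treating 0 as "unlimited" is an assumption, and the clock and sleep functions are injectable purely so the behavior can be checked without real delays.

```python
import io
import time


class ThrottledReader:
    """Reads from a file object with a capped operation size and rate."""

    def __init__(self, fileobj, max_iops=0, max_iosize=0,
                 clock=time.monotonic, sleep=time.sleep):
        self._f = fileobj
        # Minimum gap between operations, e.g. 0.1 s for max_iops=10.
        self._min_gap = 1.0 / max_iops if max_iops > 0 else 0.0
        self._max_iosize = max_iosize
        self._clock = clock
        self._sleep = sleep
        self._last_op = None
        self.ops = 0  # number of underlying read operations issued

    def _wait_turn(self):
        now = self._clock()
        if self._last_op is not None:
            remaining = self._min_gap - (now - self._last_op)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_op = now

    def read(self, size):
        chunks = []
        while size > 0:
            # Split a large request into pieces of at most max_iosize bytes.
            step = min(size, self._max_iosize) if self._max_iosize > 0 else size
            self._wait_turn()  # each piece counts as one throttled operation
            data = self._f.read(step)
            self.ops += 1
            if not data:
                break  # end of file
            chunks.append(data)
            size -= len(data)
        return b"".join(chunks)
```

For example, ThrottledReader(io.BytesIO(b"x" * 1000), max_iops=10, max_iosize=256).read(1000) issues four reads of at most 256 bytes each, spaced at least a tenth of a second apart.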

Practical Implementation Examples

Each of the features we've described can be found successfully deployed in production.

The following sections review several of these real-world Sphinx deployments, briefly

describing the sites and some implementation details.

Full-Text Searching on Mininova.org

A popular torrent search engine, Mininova (http://www.mininova.org) provides a clear

example of how to optimize “just” full-text searching. Sphinx replaced several MySQL

replicas using MySQL built-in full-text indexes, which were unable to handle the load.

After the replacement, the search servers were underloaded; the load average

is now in the 0.3-0.4 range.

Here are the database size and load numbers:

• The site has a small database, with about 300,000-500,000 records and about

300-500 MB of index.

• The site load is quite high: about 8-10 million searches per day at the time of this

writing.

The data mostly consists of user-supplied filenames, frequently without proper

punctuation. For this reason, prefix indexing is used instead of whole-word indexing. The
