Advanced MySQL Features - High Performance MySQL

MySQL, you might want “mysql” to be a stopword, because it's too common to be helpful.

You can often improve performance by skipping short words. The minimum length is configurable with the ft_min_word_len parameter. Raising it above the default will skip more words, making your index smaller and faster, but less accurate. Also bear in mind that for special purposes, you might need very short words. For example, a full-text search of consumer electronics products for the query “cd player” is likely to produce lots of irrelevant results unless short words are allowed in the index. A user searching for “cd player” won't want to see MP3 and DVD players in the results, but if the minimum word length is the default four characters, the search will actually be for just “player,” so all types of players will be returned.
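
The effect of the minimum word length on the “cd player” query can be illustrated with a small simulation. The sketch below is not MySQL's parser; it is a hypothetical Python tokenizer (the name index_terms and its min_word_len argument are made up for illustration) that simply drops words shorter than a threshold, roughly the way ft_min_word_len does.

```python
import re

def index_terms(text, min_word_len=4):
    """Return the lowercase words in `text` that are at least `min_word_len` long.

    A simplified stand-in for a full-text parser: it splits on
    non-alphanumeric characters and ignores stopwords entirely.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if len(w) >= min_word_len]

query = "cd player"
print(index_terms(query))                  # minimum of four: only "player" survives
print(index_terms(query, min_word_len=2))  # shorter minimum: "cd" is kept too
```

With the threshold at four, “cd”, “mp3”, and “dvd” are all discarded, so the three kinds of player become indistinguishable to the search.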

The stopword list and the minimum word length can improve search speeds by keeping some words out of the index, but the search quality can suffer as a result. The right balance is application-dependent. If you need good performance and good-quality results, you'll have to customize both parameters for your application. It's a good idea to build in some logging and then investigate common searches, uncommon searches, searches that don't return results, and searches that return a lot of results. You can gain insight about your users and your searchable content this way, and then use that insight to improve performance and the quality of your search results.
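
As a rough illustration of the kind of log analysis described above, the following sketch groups logged searches into the four categories mentioned. The log format (a list of query and result-count pairs), the function name summarize_searches, and the many_results threshold are all assumptions made for this example, not part of MySQL.

```python
from collections import Counter

def summarize_searches(log, many_results=1000):
    """Summarize a search log given as (query, result_count) pairs."""
    counts = Counter(query for query, _ in log)
    # Sort by frequency, then alphabetically, so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {
        "most_common": ranked[:3],
        "rare": sorted(q for q, n in counts.items() if n == 1),
        "no_results": sorted({q for q, hits in log if hits == 0}),
        "too_many_results": sorted({q for q, hits in log if hits >= many_results}),
    }

# Made-up sample data for illustration only.
sample_log = [
    ("cd player", 0),
    ("dvd player", 42),
    ("player", 5000),
    ("cd player", 0),
    ("mp3", 0),
]
summary = summarize_searches(sample_log)
```

In this made-up log, a frequent query that returns nothing (“cd player”) hints that the minimum word length or the stopword list may be filtering out terms users care about.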

Be aware that if you change the minimum word length, you'll have to rebuild the index with OPTIMIZE TABLE for the change to take effect. A related parameter is ft_max_word_len, which is mainly a safeguard to avoid indexing very long keywords.

If you're importing a lot of data into a server and you want full-text indexing on some columns, disable the full-text indexes before the import with DISABLE KEYS and enable them afterward with ENABLE KEYS. This is usually much faster because of the high cost of updating the index for each row inserted, and you'll get a defragmented index as a bonus.
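
A minimal sketch of that import pattern follows, assuming a DB-API style cursor such as the ones MySQL drivers for Python provide. The helper bulk_load, the RecordingCursor stand-in, and the products table are hypothetical; only the ALTER TABLE ... DISABLE KEYS and ENABLE KEYS statements come from MySQL itself.

```python
def bulk_load(cursor, table, insert_statements):
    """Run `insert_statements` with the table's nonunique indexes disabled.

    `cursor` is any DB-API style object with an execute() method.
    `table` must be a trusted identifier; it is interpolated, not escaped.
    """
    cursor.execute(f"ALTER TABLE {table} DISABLE KEYS")
    try:
        for statement in insert_statements:
            cursor.execute(statement)
    finally:
        # Re-enable (and rebuild) the indexes even if the import fails.
        cursor.execute(f"ALTER TABLE {table} ENABLE KEYS")


class RecordingCursor:
    """Stand-in cursor that records statements instead of contacting a server."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


cursor = RecordingCursor()
bulk_load(cursor, "products", ["INSERT INTO products (title) VALUES ('cd player')"])
```

Putting ENABLE KEYS in a finally clause is a design choice for this sketch: it avoids leaving the table with disabled indexes if one of the inserts raises an error.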

For large datasets, you might need to manually partition the data across many nodes and search them in parallel. This is a difficult task, and you might be better off using an external full-text search engine, such as Lucene or Sphinx. In our experience, they can perform orders of magnitude better.
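
To give a sense of what searching partitions in parallel involves, here is a simplified scatter-gather sketch. Everything in it is an assumption for illustration: each shard is modeled as a plain Python callable returning (score, doc_id) pairs, and the names search_all and make_shard are invented. A real deployment would also have to deal with node failures, and relevance scores computed on separate shards might not be directly comparable, which is part of why the task is difficult.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

def search_all(shards, query, limit=10):
    """Query every shard in parallel and merge the results by score.

    Each shard is a callable taking (query, limit) and returning a list of
    (score, doc_id) pairs.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda shard: shard(query, limit), shards))
    merged = [hit for partial in partials for hit in partial]
    # Keep only the highest-scoring hits across all shards.
    return heapq.nlargest(limit, merged)

def make_shard(rows):
    """Build a fake shard from (score, doc_id) rows; it ignores the query text."""
    def shard(query, limit):
        return sorted(rows, reverse=True)[:limit]
    return shard

shards = [make_shard([(0.9, "a"), (0.2, "b")]), make_shard([(0.7, "c")])]
top = search_all(shards, "cd player", limit=2)
```

Each shard returns at most limit rows, so the merge step only ever handles limit times the number of shards candidates, regardless of how large each partition is.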

Distributed (XA) Transactions

Whereas storage engine transactions (see “Transactions” on page 6) give ACID properties inside the storage engine, a distributed (XA) transaction is a higher-level transaction that can extend some ACID properties outside the storage engine, and even
