Databases Reference
In-Depth Information
with the most significant digits first, as in XY142536, and place them in a column that
is included in the full-text index. Now if you want to limit searches to, for example, a
rectangle bounded by X between 100 and 199 and Y between 400 and 499, you can
add “+XY14*” to the search query. This can be faster than filtering with a WHERE clause.
A technique that sometimes works well with full-text indexes, especially for paginated
displays, is to select a list of primary keys by a full-text query and cache the results.
When the application is ready to render some results, it can issue another query that
fetches the desired rows by their IDs. This second query can include more complicated
criteria or joins that need to use other indexes to work well.
Even though only MyISAM supports full-text indexes, if you need to use InnoDB or
another storage engine instead, you can replicate your tables to a server that uses the
MyISAM storage engine, then use the replica to serve full-text queries. If you don't want
to serve some queries from a different server, you can partition a table vertically by
breaking it into two, keeping textual columns separate from the rest of the data.
You can also duplicate some columns into a table that's full-text indexed. You can see
this strategy in action in the sakila.film_text table, which is maintained with triggers.
Yet another alternative is to use an external full-text engine, such as Lucene or Sphinx.
You can read more about Sphinx in Appendix F .
GROUP BY queries with full-text searches can be performance killers, again because the
full-text query typically finds a lot of matches; these cause random disk I/O, followed
by a temporary table or filesort for the grouping. Because such queries are often just
looking for the top items per group, a good optimization is to sample the results instead
of trying for complete accuracy. For example, select the first 1,000 rows into a tempo-
rary table, then return the top result per group from that.
Full-Text Configuration and Optimization
Regular maintenance of your full-text indexes is one of the most important things you
can do to enhance performance. The double-B-Tree structure of full-text indexes, com-
bined with the large number of keywords in typical documents, means they suffer from
fragmentation much more than normal indexes. You might need to use OPTIMIZE
TABLE frequently to defragment the indexes. If your server is I/O-bound, it might be
much faster to just drop and recreate the full-text indexes periodically.
A server that must perform well for full-text searches needs key buffers that are large
enough to hold the full-text indexes, because they work much better when they're in
memory. You can use dedicated key buffers to make sure other indexes don't flush your
full-text indexes from the key buffer. See Chapter 8 for more details on MyISAM key
buffers.
It's also important to provide a good stopword list. The defaults will work well for
English prose, but they might not be good for other languages or for specialized texts,
such as technical documents. For example, if you're indexing a document about
 
Search WWH ::




Custom Search