Databases Reference
In-Depth Information
A Boolean full-text search doesn't actually require a full-text index to work, although
it does require the MyISAM storage engine. It will use a full-text index if there is one,
but if there isn't, it will just scan the entire table. You can even use a Boolean full-text
search on columns from multiple tables, such as the results of a join. In all of these
cases, though, it will be slow.
Full-Text Changes in MySQL 5.1
MySQL 5.1 introduced quite a few changes related to full-text searching. These include
performance improvements and the ability to build pluggable parsers that can enhance
the built-in capabilities. For example, plugin can change the way indexing works. They
can split text into words more flexibly than the defaults (you can specify that “C++”
is a single word, for example), do preprocessing, index different content types (such as
PDF), or do custom word stemming. The plugins can also influence the way searches
work—for example, by stemming search terms.
Full-Text Tradeoffs and Workarounds
MySQL's implementation of full-text searching has several design limitations. These
can be contraindications for specific purposes, but there are also many ways to work
around them.
For example, there is only one form of relevance ranking in MySQL's full-text indexing:
frequency. The index doesn't record the indexed word's position in the string, so
proximity doesn't contribute to relevance. Although that's fine for many purposes—
especially for small amounts of data—it might not be what you need, and MySQL's
full-text indexing doesn't give you the flexibility to choose a different ranking algo-
rithm. (It doesn't even store the data you'd need for proximity-based ranking.)
Size is another issue. MySQL's full-text indexing performs well when the index fits in
memory, but if the index is not in memory it can be very slow, especially when the fields
are large. When you're using phrase searches, the data and indexes must both fit in
memory for good performance. Compared to other index types, it can be very expensive
to insert, update, or delete rows in a full-text index:
• Modifying a piece of text with 100 words requires not 1 but up to 100 index
operations.
• The field length doesn't usually affect other index types much, but with full-text
indexing, text with 3 words and text with 10,000 words will have performance
profiles that differ by orders of magnitude.
• Full-text search indexes are also much more prone to fragmentation, and you might
find you need to use OPTIMIZE TABLE more frequently.
Full-text indexes affect how the server optimizes queries, too. Index choice, WHERE clau-
ses, and ORDER BY all work differently from how you might expect:
 
Search WWH ::




Custom Search