Databases Reference
In-Depth Information
MySQL 5.6 to be released and use InnoDB. Still, if using MyISAM is acceptable for
your application, read on.
A MyISAM full-text index operates on a full-text collection , which is made up of one or
more character columns from a single table. In effect, MySQL builds the index by
concatenating the columns in the collection and indexing them as one long string of
text.
A MyISAM full-text index is a special type of B-Tree index with two levels. The first
level holds keywords. Then, for each keyword, the second level holds a list of associated
document pointers that point to full-text collections that contain that keyword. The
index doesn't contain every word in the collection. It prunes it as follows:
• A list of stopwords weeds out “noise” words by preventing them from being in-
dexed. The stopword list is based on common English usage by default, but you
can use the ft_stopword_file option to replace it with a list from an external file.
• The index ignores words unless they're longer than ft_min_word_len characters
and shorter than ft_max_word_len characters.
Full-text indexes don't store information about which column in the collection a key-
word occurs in, so if you need to search on different combinations of columns, you will
need to create several indexes.
This also means you can't instruct a MATCH AGAINST clause to regard words from a par-
ticular column as more important than words from other columns. This is a common
requirement when building search engines for websites. For example, you might want
search results to appear first when the keywords appear in an item's title. If you need
this, you'll have to write more complicated queries. (We show an example later.)
Natural-Language Full-Text Searches
A natural-language search query determines each document's relevance to the query.
Relevance is based on the number of matched words and the frequency with which
they occur in the document. Words that are less common in the entire index make a
match more relevant. In contrast, extremely common words aren't worth searching for
at all. A natural-language full-text search excludes words that exist in more than 50%
of the rows in the table, even if they're not in the stopword list. 10
The syntax of a full-text search is a little different from other types of queries. You tell
MySQL to do full-text matching with MATCH AGAINST in the WHERE clause. Let's look at
an example. In the standard Sakila sample database, the film_text table has a full-text
index on the title and description columns:
10. A common mistake during testing is to put a few rows of sample data into a full-text search index, only
to find that no queries match. The problem is that every word appears in more than half the rows.
 
Search WWH ::




Custom Search