Database Reference
In-Depth Information
mysql> SELECT COUNT(*) FROM kjv WHERE MATCH(vtext) AGAINST('God');
+----------+
| COUNT(*) |
+----------+
| 0 |
+----------+
mysql> SELECT COUNT(*) FROM kjv WHERE MATCH(vtext) AGAINST('sin');
+----------+
| COUNT(*) |
+----------+
| 0 |
+----------+
One property of the indexing engine is that it ignores words that are “too common”
(that is, words that occur in more than half the rows). This eliminates words such as
“the” or “and” from the index, but that's not what is going on here. You can verify that
by counting the total number of rows, and by using SQL pattern matches to count the
number of rows containing each word (see Recipe 8.1 regarding the use of COUNT() to
produce multiple counts from the same set of values):
mysql> SELECT COUNT(*) AS 'total verses',
-> COUNT(IF(vtext LIKE '%God%',1,NULL)) AS 'verses containing "God"',
-> COUNT(IF(vtext LIKE '%sin%',1,NULL)) AS 'verses containing "sin"'
-> FROM kjv;
+--------------+-------------------------+-------------------------+
| total verses | verses containing "God" | verses containing "sin" |
+--------------+-------------------------+-------------------------+
| 31102 | 4117 | 1292 |
+--------------+-------------------------+-------------------------+
Neither word is present in more than half the verses, so sheer frequency of occurrence
doesn't account for the failure of a full-text search to find them. What's really happening
is that, by default, the MyISAM full-text indexing engine doesn't include words less than
four characters long. The minimum word length is a configurable parameter; to change
it, set the ft_min_word_len system variable. For example, to tell the indexing engine to
include words as short as three characters, add a line to the [mysqld] group of the /etc/
my.cnf file (or whatever option file you use for server settings):
[mysqld]
ft_min_word_len=3
After making this change, restart the server. Next, rebuild the FULLTEXT index to take
advantage of the new setting:
mysql> REPAIR TABLE kjv QUICK;
(You should also use REPAIR TABLE to rebuild the indexes for all other MyISAM tables
that have FULLTEXT indexes.)
Finally, try the new index to verify that it includes shorter words:
Search WWH ::




Custom Search