Databases Reference
In-Depth Information
An example of when this approach works well is for URL lookups. URLs generally
cause B-Tree indexes to become huge, because they're very long. You'd normally query
a table of URLs like this:
mysql> SELECT id FROM url WHERE url="http://www.mysql.com";
But if you remove the index on the url column and add an indexed url_crc column to
the table, you can use a query like this:
mysql> SELECT id FROM url WHERE url="http://www.mysql.com"
-> AND url_crc=CRC32("http://www.mysql.com");
This works well because the MySQL query optimizer notices there's a small, highly
selective index on the url_crc column and does an index lookup for entries with that
value (1560514994, in this case). Even if several rows have the same url_crc value, it's
very easy to find these rows with a fast integer comparison and then examine them to
find the one that matches the full URL exactly. The alternative is to index the full URL
as a string, which is much slower.
One drawback to this approach is the need to maintain the hash values. You can do
this manually or, in MySQL 5.0 and newer, you can use triggers. The following example
shows how triggers can help maintain the url_crc column when you insert and update
values. First, we create the table:
CREATE TABLE pseudohash (
id int unsigned NOT NULL auto_increment,
url varchar(255) NOT NULL,
url_crc int unsigned NOT NULL DEFAULT 0,
PRIMARY KEY(id)
);
Now we create the triggers. We change the statement delimiter temporarily, so we can
use a semicolon as a delimiter for the trigger:
DELIMITER //
CREATE TRIGGER pseudohash_crc_ins BEFORE INSERT ON pseudohash FOR EACH ROW BEGIN
SET NEW.url_crc=crc32(NEW.url);
END;
//
CREATE TRIGGER pseudohash_crc_upd BEFORE UPDATE ON pseudohash FOR EACH ROW BEGIN
SET NEW.url_crc=crc32(NEW.url);
END;
//
DELIMITER ;
 
Search WWH ::




Custom Search