Indexing for High Performance - High Performance MySQL

Databases Reference

In-Depth Information

• MySQL can't use hash indexes for sorting because they don't store rows in sorted

order.

• Hash indexes don't support partial key matching, because they compute the hash

from the entire indexed value. That is, if you have an index on (A,B) and your

query's WHERE clause refers only to A , the index won't help.

• Hash indexes support only equality comparisons that use the = , IN() , and <=>

operators (note that <> and <=> are not the same operator). They can't speed up

range queries, such as WHERE price > 100 .

• Accessing data in a hash index is very quick, unless there are many collisions (mul-

tiple values with the same hash). When there are collisions, the storage engine must

follow each row pointer in the linked list and compare their values to the lookup

value to find the right row(s).

• Some index maintenance operations can be slow if there are many hash collisions.

For example, if you create a hash index on a column with a very low selectivity

(many hash collisions) and then delete a row from the table, finding the pointer

from the index to that row might be expensive. The storage engine will have to

examine each row in that hash key's linked list to find and remove the reference to

the one row you deleted.

These limitations make hash indexes useful only in special cases. However, when they

match the application's needs, they can improve performance dramatically. An exam-

ple is in data-warehousing applications where a classic “star” schema requires many

joins to lookup tables. Hash indexes are exactly what a lookup table requires.

In addition to the Memory storage engine's explicit hash indexes, the NDB Cluster

storage engine supports unique hash indexes. Their functionality is specific to the NDB

Cluster storage engine, which we don't cover in this topic.

The InnoDB storage engine has a special feature called adaptive hash indexes . When

InnoDB notices that some index values are being accessed very frequently, it builds a

hash index for them in memory on top of B-Tree indexes. This gives its B-Tree indexes

some properties of hash indexes, such as very fast hashed lookups. This process is

completely automatic, and you can't control or configure it, although you can disable

the adaptive hash index altogether.

If your storage engine doesn't support hash indexes, you

can emulate them yourself in a manner similar to that InnoDB uses. This will give you

access to some of the desirable properties of hash indexes, such as a very small index

size for very long keys.

The idea is simple: create a pseudohash index on top of a standard B-Tree index. It will

not be exactly the same thing as a real hash index, because it will still use the B-Tree

index for lookups. However, it will use the keys' hash values for lookups, instead of

the keys themselves. All you need to do is specify the hash function manually in the

query's WHERE clause.

Building your own hash indexes.

Search WWH ::

Custom Search

Home