Selecting Indexes - Physical Database Design

Databases Reference

In-Depth Information

4.4 Join Index Selection

Up to this point, index selection has assumed simple queries based only on selection cri-

teria, e.g.

SELECT lastName

FROM employee

WHERE empNo = 45678;

However, most queries involve at least one or more joins, and in-dexes are com-

monly used to speed up the join operations. This sec-tion discusses the index selec-

tion tradeoffs involved in join opera-tions using our simple mathematical model in

Appendix A.

The basic join implementations we will consider are:

•

Nested-loop join

-Block n ed loop join

-

Indexed nested-loop join

•

Sort-merge join

•

Hash join

All these basic join implementations are supported by DB2, SQL Server, and Oracle, as

well as many other systems.

Our analysis is based on the assumption that if the join is between a table whose

foreign key matches another table's primary key, then m represents the number of rows

in the table with the primary key and n represents the number of rows in the table with

the foreign key. If the join is not between two such tables, then the designations of

which table has m rows and which one has n rows is arbitrary.

4.4.1 Nested-loop Join

The nested-loop strategy is the basic method of join. The outer loop is a sequential scan

of the first table R, and for each row in table R scanned, the inner loop is executed, a

sequential scan of the second table S. Our basic parameters are the number of rows, m

and n , in the two tables to be joined and the physical order of rows in each table. The

complexity is O ( mn ) because of the double loop. We assume that each table is stored in

physically contiguous disk space to facilitate table scans. The time cost of executing this

strategy also depends on which table we select for the outer and inner loops.

Search WWH ::

Custom Search

Home