Databases Reference
In-Depth Information
4.4 Join Index Selection
Up to this point, index selection has assumed simple queries based only on selection cri-
teria, e.g.
SELECT lastName
FROM employee
WHERE empNo = 45678;
However, most queries involve at least one or more joins, and in-dexes are com-
monly used to speed up the join operations. This sec-tion discusses the index selec-
tion tradeoffs involved in join opera-tions using our simple mathematical model in
Appendix A.
The basic join implementations we will consider are:
Nested-loop join
-Block n ed loop join
-
Indexed nested-loop join
Sort-merge join
Hash join
All these basic join implementations are supported by DB2, SQL Server, and Oracle, as
well as many other systems.
Our analysis is based on the assumption that if the join is between a table whose
foreign key matches another table's primary key, then m represents the number of rows
in the table with the primary key and n represents the number of rows in the table with
the foreign key. If the join is not between two such tables, then the designations of
which table has m rows and which one has n rows is arbitrary.
4.4.1 Nested-loop Join
The nested-loop strategy is the basic method of join. The outer loop is a sequential scan
of the first table R, and for each row in table R scanned, the inner loop is executed, a
sequential scan of the second table S. Our basic parameters are the number of rows, m
and n , in the two tables to be joined and the physical order of rows in each table. The
complexity is O ( mn ) because of the double loop. We assume that each table is stored in
physically contiguous disk space to facilitate table scans. The time cost of executing this
strategy also depends on which table we select for the outer and inner loops.
Search WWH ::




Custom Search