Database Reference
In-Depth Information
joins . It is more difficult to design good indexes for a join than for a single-table
SELECT because the join method and the table access order have a great impact
on index requirements. The methods discussed in previous chapters, however,
are still applicable to joins but with additional considerations.
In a join SELECT there are two kinds of predicates, local predicates that
access a single table and join predicates that define the connection between
tables. Most optimizers choose the join method and table access order by esti-
mating the local response time of numerous alternatives. The most common join
method is the nested-loop join (SQL Server: loop join ), although merge scan join
(Oracle: sort-merge join ;SQLServer: merge join )and hash join are becoming
more popular; this is because sequential read is much faster than it used to be.
In a nested-loop join, the DBMS first finds a row in the outer table that satisfies
the local predicates referring to that table. Then it looks for the related rows
in the next table, the inner table , and checks which of these satisfy their local
predicates—and so on.
A two-table join could be replaced by two single-table cursors and two nested
loops coded in the program. It is then the programmer who makes the decision
about the table access order—which table is accessed in the outer loop. When a
join is used, the optimizer makes that decision, although this may be overridden
by hints or other tricks.
Join predicates are most often based on primary key = foreign key . Assum-
ing the correct foreign key indexes (starting with the foreign key columns) have
been created and the result table is not exceptionally large, a nested loop is likely
to be the fastest method, at least if all local predicates refer to one table.
If a nested-loop join is not appropriate, a merge scan or hash join may be
faster; with the former, one or more tables are sorted into a consistent order,
if necessary (after local predicates have been applied), and the qualifying rows
in the tables or work files are merged; the hash join is basically a merge scan
with hashing instead of sorting—more details later. With these methods, no table
page is accessed more than once. Oracle, SQL Server, and DB2 for LUW tend
to choose hash join instead of merge scan.
DB2 for z/OS does not use a hash join; in some less common situations a
hybrid join — essentially a nested loop with list prefetch—may be preferred.
Note
ž In this chapter we will initially concentrate on the nested-loop join and
then compare this with the merge scan/hash join.
ž Where data is required from more than one table, the column names in
this chapter are prefixed with the table names to show from which table
or index the columns are to be found.
TWO SIMPLE JOINS
We will first analyze two simple join examples to show the processes involved
in a nested-loop join and how we can calculate the cost of the joins.
Search WWH ::




Custom Search