Database Reference
In-Depth Information
Listing 25-7. Outer nested loop join algorithm
for each row R1 in outer table
for each row R2 in inner table
if R1 joins with R2
return join (R1, R2)
else
return join (R1, NULL)
As you can see, the cost of the algorithm depends on the size of the inputs, and it is proportional to its
multiplication; that is, the size of the outer input multiplied by the size of the inner input. The cost grows quickly with
the size of the inputs; therefore, a nested loop join is efficient when at least one of the inputs is small.
A nested loop join does not require join keys to have an equality predicate. SQL Server evaluates a join predicate
between every row from both inputs. In fact, it does not require a join predicate at all. For example, the CROSS JOIN
logical operator would lead to a nested loop physical join when every row from both inputs are joined together.
Merge Join
The merge join works with two sorted inputs. It compares two rows, one at time, and returns their join to the client if
they are equal. Otherwise, it discards the lesser value and moves on to the next row in the input.
Contrary to nested loop joins, a merge join requires at least one equality predicate on the join keys. Listing 25-8
shows the algorithm for the inner merge join.
Listing 25-8. Inner merge join algorithm
/* Prerequirements: Inputs I1 and I2 are sorted */
get first row R1 from input I1
get first row R2 from input I2
while not end of either input
begin
if R1 joins with R2
begin
return join (R1, R2)
get next row R2 from I2
end
else if R1 < R2
get next row R1 from I1
else /* R1 > R2 */
get next row R2 from I2
end
The cost of the merge join algorithm is proportional to the sum of the sizes of both inputs, which makes it more
efficient on large inputs as compared to a nested loop join. However, a merge join requires both inputs to be sorted,
which is often the case when inputs are indexed on the join key column.
In some cases, however, SQL Server may decide to sort input(s) using the Sort operator before a merge join.
The cost of the sort obviously needs to be factored together with the cost of the join operator during its analysis.
Tip
in some cases, you can eliminate the Sort operator by creating corresponding indexes that pre-sort the data.
 
 
Search WWH ::




Custom Search