Analyzing Query Performance - SQL Server Query Performance Tuning

Database Reference

In-Depth Information

In many simple queries affecting a small set of rows, nested loop joins are far superior to both hash and merge

joins. As joins get more complicated, the other join types are used where appropriate. None of the join types is

by definition bad or wrong. You're primarily looking for places where the optimizer may have chosen a type not

compatible with the data in hand. This is usually caused by discrepancies in the statistics available to the optimizer

when it's deciding which of the types to use.

Hash Join

To understand SQL Server's hash join strategy, consider the following simple query:

SELECT p.*

FROM Production.Product p

JOIN Production.ProductCategory pc

ON p.ProductSubcategoryID = pc.ProductCategoryID;

Table 7-1 shows the two tables' indexes and number of rows.

Table 7-1. Indexes and Number of Rows of the Products and ProductCategory Tables

Table

Indexes

Number of Rows

Product

Clustered index on ProductID

504

ProductCategory

Clustered index on ProductCategoryld

4

Figure 7-7 shows the execution plan for the preceding query.

Figure 7-7. Execution plan with a hash join

You can see that the optimizer used a hash join between the two tables.

A hash join uses the two join inputs as a build input and a probe input. The build input is shown as the top

input in the execution plan, and the probe input is shown as the bottom input. Usually the smaller of the two inputs

serves as the build input because it's going to be stored on the system, so the optimizer attempts to minimize the

memory used.

Search WWH ::

Custom Search

Home