Trivial Plan Match
Sometimes there might be only one way to execute a query. For example, a heap table with no indexes can be
accessed in only one way: via a table scan. To avoid the runtime overhead of optimizing such queries, SQL Server
maintains a list of patterns that define a trivial plan. If the optimizer finds a match, then a similar plan is generated
for the query without any optimization. The generated plans are then stored in the procedure cache. Eliminating the
optimization phase means that the cost for generating a trivial plan is very low. This is not to imply that trivial plans
are desired or preferable to more complex plans. Trivial plans are available only for extremely simple queries. Once
the complexity of the query rises, it must go through optimization.
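You can confirm which path a statement took by looking at its execution plan. The following is a minimal sketch, assuming a throwaway heap named dbo.TrivialDemo created only for this illustration:

-- A heap with no indexes: the only access path is a table scan,
-- so the statement can match a trivial plan pattern.
CREATE TABLE dbo.TrivialDemo (ID INT, Payload VARCHAR(50));

SELECT ID, Payload
FROM dbo.TrivialDemo;
-- In the estimated or actual execution plan, the root SELECT operator's
-- Optimization Level property (StatementOptmLevel in the plan XML) shows
-- TRIVIAL here; a statement that went through optimization shows FULL.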
Multiple Optimization Phases
For a nontrivial query, the number of alternative processing strategies to be analyzed can be high, and it may take a
long time to evaluate each option. Therefore, the optimizer goes through three different levels of optimization. These
are referred to as search 0, search 1, and search 2, but it's easier to think of them as transaction processing, quick plan, and full
optimization. Depending on the size and complexity of the query, these different optimizations may be tried one at
a time, or the optimizer might skip straight to full optimization. Each of the optimization phases considers
different join techniques and different ways of accessing the data through scans, seeks, and other operations.
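One way to watch how often each phase is reached, sketched here on the assumption that you have VIEW SERVER STATE permission, is to read the cumulative counters exposed by sys.dm_exec_query_optimizer_info; the counter names correspond to the phase names used here:

-- Instance-wide counters, cumulative since the last restart. 'trivial plan'
-- counts trivial-plan matches; 'search 0', 'search 1', and 'search 2' count
-- optimizations that ended in each phase.
SELECT counter,
       occurrence,
       value
FROM sys.dm_exec_query_optimizer_info
WHERE counter IN ('optimizations', 'trivial plan',
                  'search 0', 'search 1', 'search 2');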
The index variations consider different indexing aspects, such as single-column index, composite index, index
column order, column density, and so forth. Similarly, the join variations consider the different join techniques available
in SQL Server: nested loop join, merge join, and hash join. (Chapter 4 covers these join techniques in detail.) Constraints
such as unique values and foreign key constraints are also part of the optimization decision-making process.
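Purely for illustration, the join hint below forces one of those physical join algorithms on the same pair of tables used in the example later in this section; in normal use the choice is left to the optimizer:

-- LOOP, MERGE, and HASH join hints override the optimizer's join choice.
SELECT soh.SalesOrderID,
       sod.SalesOrderDetailID
FROM Sales.SalesOrderHeader AS soh
INNER LOOP JOIN Sales.SalesOrderDetail AS sod -- forces a nested loop join
    ON soh.SalesOrderID = sod.SalesOrderID;
-- Replace LOOP with MERGE or HASH to force the other two algorithms.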
The optimizer considers the statistics of the columns referred to in the WHERE clause to evaluate the effectiveness
of the index and the join strategies. Based on the current statistics, it evaluates the cost of the configurations in
multiple optimization phases. The cost takes many factors into account, including (but not limited to) the CPU, memory,
and disk I/O (including random versus sequential I/O estimation) required to execute the query. After each
optimization phase, the optimizer evaluates the cost of the processing strategy. This cost is an estimation only, not
an actual measure or prediction of behavior; it's a mathematical construct based on the statistics and the processes
under consideration. If the cost is found to be low enough, then the optimizer stops further iteration through the
optimization phases and quits the optimization process. Otherwise, it keeps iterating through the optimization phases
to determine a cost-effective processing strategy.
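You can see the estimate the optimizer settles on by requesting the estimated plan, as in this sketch (the query is arbitrary; the cost figure appears as the EstimatedTotalSubtreeCost attribute, labeled Estimated Subtree Cost in the graphical plan):

SET SHOWPLAN_XML ON;
GO
-- Returns the estimated plan XML without executing the query; each operator
-- carries EstimatedTotalSubtreeCost, a unitless estimate rather than a
-- measured duration.
SELECT soh.SalesOrderNumber
FROM Sales.SalesOrderHeader AS soh;
GO
SET SHOWPLAN_XML OFF;
GO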
Sometimes a query can be so complex that the optimizer needs to extensively iterate through the optimization
phases. While optimizing the query, if it finds that the cost of the processing strategy is more than the cost threshold
for parallelism, then it evaluates the cost of processing the query using multiple CPUs. Otherwise, the optimizer
proceeds with the serial plan.
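The threshold itself is an instance-wide setting. A minimal sketch of inspecting it (it is an advanced option, so show advanced options must be enabled first):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- run_value is the estimated-cost threshold (5 by default) above which
-- a parallel plan is considered.
EXEC sp_configure 'cost threshold for parallelism';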
You can find out some of the details of what occurred during the multiple optimization phases via two sources. Take, for
example, this query:
SELECT soh.SalesOrderNumber,
       sod.OrderQty,
       sod.LineTotal,
       sod.UnitPrice,
       sod.UnitPriceDiscount,
       p.[Name] AS ProductName,
       p.ProductNumber,
       ps.[Name] AS ProductSubCategoryName,
       pc.[Name] AS ProductCategoryName
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON soh.SalesOrderID = sod.SalesOrderID
JOIN Production.Product AS p
    ON sod.ProductID = p.ProductID
JOIN Production.ProductSubcategory AS ps
    ON p.ProductSubcategoryID = ps.ProductSubcategoryID
JOIN Production.ProductCategory AS pc
    ON ps.ProductCategoryID = pc.ProductCategoryID;
 