Databases Reference
In-Depth Information
QUERY OPTIMIZATION
The job of the Query Optimizer is to take the query tree that was output from the algebrizer and
i nd a “good” way to retrieve the data (results) needed. Note the use of “good” here, rather than
“best,” as for any nontrivial query, there may be hundreds, or even thousands, of different ways to
achieve the same results, so i nding the absolutely best one can be an extremely time-consuming
process. Therefore, in order to provide results in a timely manner, the Query Optimizer looks for
a “good enough” plan, and uses that. This approach means that you may very well be able to do
better when you manually inspect the query plan; and in the section “Inl uencing Optimization” you
will look at different ways you can affect the decisions that SQL Server makes during optimization.
The query optimization process is based on a principle of cost , which is an abstract measure of work
that is used to evaluate different query plan options. The exact nature of these costs is a closely
guarded secret, with some people suggesting that they are a rel ection of the time, in seconds, that
the query is expected to take. They also take into account I/O and CPU resources. However, users
should consider cost to be a dimensionless value that doesn't have any units — its value is derived
from comparisons to the cost of other plans in order to i nd the cheapest one. Therefore, there are
no true units for cost values.
Although the exact details of what SQL Server does within the optimization phase are secret, it's
possible to get a glimpse at some of what goes on. For the purposes of this topic, you don't need to
know every small detail, and in fact such a deep understanding isn't useful anyway. For one thing,
there is nothing you can do to alter this process; moreover, with each new service pack or hoti x,
the SQL Server team tunes the internal algorithms, thereby changing the exact behavior. If you
were to know too much about what was occurring, you could build in dependencies that would
break with every new version of SQL Server.
Rather than know all the details, you need only understand the bigger picture. Even this bigger
picture is often too much information, as it doesn't offer any real visibility into what the Query
Optimizer is doing. All you can see of this secretive process is what is exposed in the Dynamic
Management View (DMV) sys.dm_exec_query_optimizer_info . This can be interesting, but it's
not a great deal of help in understanding why a given T-SQL statement is assigned a particular plan,
or how you can “i x” what you think may be a non-optimal plan.
The current model provided by the SQL Server team works something like this:
Is a valid plan cached? If yes, then use the cached plan. If no plan exists, then continue.
Is this a trivial plan? If yes, then use the trivial plan. If no, then continue.
Apply simplii cation. Simplii cation is a process of normalizing the query tree and applying
some basic transformations to additionally “simplify” the tree.
Is the plan cheap enough? If yes, then use this. If no, then start optimization.
Start cost-based optimization.
Phase 0 Explore basic rules, and hash and nested join options.
Does the plan have a cost of less than 0.2? If yes, then use this. If no, then continue.
 
Search WWH ::




Custom Search