Databases Reference
In-Depth Information
would be a query that needs to return the departments that have never had an employee. The queries
using NOT IN and NOT EXISTS structures would look like this:
SELECT Name, DepartmentId
FROM HumanResources.Department
WHERE DepartmentId NOT IN
(SELECT DepartmentId FROM HumanResources.EmployeeDepartmentHistory)
SELECT dpt.Name, dpt.DepartmentId
FROM HumanResources.Department dpt
WHERE NOT EXISTS
(SELECT * FROM HumanResources.EmployeeDepartmentHistory hist
WHERE hist.DepartmentId = Dpt.DepartmentId)
If you run these two queries side by side, you'll see that they are both 50 percent of the batch, meaning
that they are the same costs from a predictive standpoint. Both the XML query plans show the exact same
graphical cost components. More detailed information can be found by setting the SHOWPLAN_ALL
option to ON and rerunning the query. For both you'll also get the same detailed results. Here we've
abbreviated some of the contents to focus on the important aspects.
SELECT ... WHERE DepartmentId NOT IN (SELECT DepartmentId FROM...
|--Nested Loops(Left Anti Semi Join,
OUTER REFERENCES:([Department].[DepartmentID]))
|--Clustered Index Scan(OBJECT:([Department].[PK_Department_DepartmentID]))
|--Top(TOP EXPRESSION:((1)))
|--Index Seek(OBJECT:([IX_EmployeeDepartmentHistory_DepartmentID]),
SEEK:([EmployeeDepartmentHistory].[DepartmentID]=[Department].[DepartmentID])
ORDERED FORWARD)
The results of the SHOWPLAN_ALL are read from the inner sections represented with the nested lines
out to the left to the outer sections. In this plan it is not nested, so you can simply read from the bottom
to the top. In this level of detail, you can see that the optimizer has chosen the best implementation for
these query requests. First a SEEK is performed into the EmployeeDepartmentHistory table using the
non-clustered index, which is the most narrow and contains the DepartmentId that is needed to join
back to the Department Table. The only clustered index scan is against the Department table and this
is because the table is so small it completely fits in a page. This is an example of how spending time to
rewrite statements doesn't provide any tangible improvement. Along similar lines, the optimizer can also
determine the best implementation between the IN and EXISTS styles of writing queries. We'll leave that
proof to you to perform.
Rewriting by Pushing Predicates Deeper into Plans
Although the optimizer is getting better, it still has some blind spots if you aren't specific enough. We'll
be going over other predicate-level changes in this chapter, and pushing predicates deeper is one of
the things that require your specific communication help through a T-SQL rewrite. The principle is that
predicates should be able to be applied as early as possible in a T-SQL statement. During the query
processing phases the WHERE statements are processed before the HAVING statements. You should
take any opportunity to reduce the row count at an earlier opportunity to save on some costs of effort.
A metaphor to this is getting all the different boxes of cereal in a grocery aisle and then putting all of
Search WWH ::




Custom Search