Now when you add the WHERE clause back in to appropriately filter the data, the resultant STATISTICS IO output turns out to be this:
(2 row(s) affected)
Table 'Product'. Scan count 0, logical reads 4...
Table 'SalesOrderDetail'. Scan count 1, logical reads 3...
Table 'SalesOrderHeader'. Scan count 0, logical reads 3...
(1 row(s) affected)
Logical reads for the SalesOrderDetail table have been cut from 1,246 to 3 because of the index seek and the loop join. This also hasn't significantly affected the data retrieval cost of the Product table.
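Output like the preceding is produced by enabling STATISTICS IO in the session before running the query. A minimal sketch follows; the join and the WHERE predicate here are illustrative assumptions against the AdventureWorks sample schema, not necessarily the exact query under discussion:

```sql
SET STATISTICS IO ON;

-- Illustrative three-table join; the SalesOrderID value is a made-up filter.
SELECT soh.SalesOrderID,
       sod.OrderQty,
       p.Name
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON soh.SalesOrderID = sod.SalesOrderID
JOIN Production.Product AS p
    ON sod.ProductID = p.ProductID
WHERE sod.SalesOrderID = 71816;

SET STATISTICS IO OFF;
```

The per-table read counts appear on the Messages tab in Management Studio, not in the result grid.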
While interpreting the output of STATISTICS IO, you mostly refer to the number of logical reads. The number of physical reads and read-ahead reads will be nonzero when the data is not found in memory, but once the data has been populated into memory, the physical reads and read-ahead reads will tend to be zero.
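You can observe this behavior on a test system (never on a production server) by flushing clean pages from the buffer pool between executions. DBCC DROPCLEANBUFFERS forces the next execution to fetch pages from disk, so physical and read-ahead reads reappear; a second execution then finds the pages cached:

```sql
-- Test systems only: write dirty pages first so they can be dropped,
-- then flush clean pages from the buffer pool.
CHECKPOINT;
DBCC DROPCLEANBUFFERS;

SET STATISTICS IO ON;

-- First run: physical/read-ahead reads will be nonzero.
SELECT COUNT(*) FROM Sales.SalesOrderDetail;

-- Second run: the same pages are now in memory, so physical reads
-- drop to zero while logical reads stay the same.
SELECT COUNT(*) FROM Sales.SalesOrderDetail;

SET STATISTICS IO OFF;
```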
There is another advantage to knowing all the tables used and their corresponding reads for a query. Both the duration and CPU values may fluctuate significantly when reexecuting the same query with no change in table schema (including indexes) or data, because the essential services and background applications running on the SQL Server machine can affect the processing time of the query under observation. But don't forget that logical reads are not always the most accurate measure: duration and CPU are still useful and remain an important part of any query tuning.
During the optimization steps, you need a nonfluctuating cost figure as a reference. The reads (or logical reads) don't vary between multiple executions of a query with a fixed table schema and data. For example, if you execute the previous SELECT statement ten times, you will probably get ten different figures for duration and CPU, but the reads will remain the same each time. Therefore, during optimization, you can refer to the number of reads for an individual table to ensure that you really have reduced the data access cost of that table. Just never assume that reads are your only measure, or even the primary one; they are simply a constant measure and therefore useful.
Even though the number of logical reads can also be obtained from Extended Events, you get another benefit when using STATISTICS IO. The number of logical reads for a query shown by Profiler or the Server Trace option increases as you use different SET statements (mentioned previously) along with the query. But the number of logical reads shown by STATISTICS IO doesn't include the additional pages that are accessed when SET statements are used with a query. Thus, STATISTICS IO provides a consistent figure for the number of logical reads.
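For comparison, reads can be captured server-wide with an Extended Events session. The sketch below uses the sql_batch_completed event, which exposes a logical_reads column; the session and file names are arbitrary choices for this example:

```sql
-- Capture completed batches, including their logical reads,
-- to an event file. Session name "QueryReads" is illustrative.
CREATE EVENT SESSION QueryReads ON SERVER
ADD EVENT sqlserver.sql_batch_completed
ADD TARGET package0.event_file (SET filename = N'QueryReads');

ALTER EVENT SESSION QueryReads ON SERVER STATE = START;
-- ... run the workload under observation ...
ALTER EVENT SESSION QueryReads ON SERVER STATE = STOP;
```

Unlike STATISTICS IO, the counts captured this way include pages touched on behalf of any SET statements running with the query.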
Summary
In this chapter, you saw that you can use Extended Events to identify the queries causing a high amount of stress on the system resources in a SQL workload. Collecting the session data can, and should, be automated using system stored procedures. For immediate access to statistics about running queries, use the DMV sys.dm_exec_query_stats. You can further analyze these queries with Management Studio to find the costly steps in the processing strategy of the query. For better performance, it is important to consider both the index and join mechanisms used in an execution plan while analyzing a query. The number of data retrievals (or reads) for the individual tables provided by SET STATISTICS IO helps you concentrate on the data access mechanism of the tables with the most reads. You should also focus on the CPU cost and overall duration of the most costly queries.
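As a starting point for that analysis, sys.dm_exec_query_stats can be ranked by cumulative reads since each plan was cached. A sketch, joining to sys.dm_exec_sql_text to recover the statement text:

```sql
-- Top cached statements by cumulative logical reads.
-- total_worker_time and total_elapsed_time are reported in microseconds.
SELECT TOP (10)
       qs.total_logical_reads,
       qs.execution_count,
       qs.total_worker_time  AS total_cpu_us,
       qs.total_elapsed_time AS total_duration_us,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                     WHEN -1 THEN DATALENGTH(st.text)
                     ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;
```

Note that these numbers are cumulative per cached plan, so they reset when a plan is evicted or the instance restarts.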
Once you identify a costly query and finish the initial analysis, the next step should be to optimize the query
for performance. Because indexing is one of the most commonly used performance-tuning techniques, in the next
chapter I will discuss in depth the various indexing mechanisms available in SQL Server.