Database Reference
In-Depth Information
DB2
GETPAGE
Oracle
Get or LIO
SQL Server
Logical read
PR = Page Request
1
ET > 200 ms Slow SELECT
2
ET/PR < 10 ms Possible culprit
3
PR/F > 5 Probable promising culprit
Figure 7.11 Filtering slow
SQL calls.
Starting with the SQL calls that have the longest elapsed time (ET), in order
to improve the response time by access path tuning, we should try to answer the
same two questions that were discussed earlier at the LRT level.
First, does the SQL call appear to be a culprit? In other words, is the ET too
long because the service time is too long? There are several ways to approach
this question. A high CPU time indicates a long service time, and so does a high
value for READS; but, in fact, PR seems to be the most useful indicator for
service time because it correlates both to CPU and to READS, and is to a large
extent repeatable; it does not depend on the system load or warm-up effects.
To identify the SQL calls that are likely to be culprits, we should ignore
calls that have a low PR. If a SELECT takes 1 s and issues less than 100 page
requests, it is probably a victim. In the worst possible case, each of these would
result in a random read; this would take 100 × 10 ms = 1 s if drive queuing
is not excessive. In reality, many of the page requests would be satisfied from
the database buffer pool and some from the disk server read cache; remember
that the reported PR includes nonleaf index pages; also many page requests are
sequential. The expected I/O time for the 100 page requests is much less than
one second and so is the expected CPU time. All in all, if the measured elapsed
time for an SQL call with 100 page requests is one second, it is very probable
that the queuing time (disk drive queuing, CPU queuing, lock waits, or other
waits) is longer than the service time . In general, if the elapsed time (ET) for
a slow SELECT is more than the number of page requests (PR) × 10 ms, the
SELECT is probably a victim; it belongs to the first subset (1) in Figure 7.11
but not to the second subset (2).
The possible culprit subset (2) consists of SQL calls whose ET/PR is not
very long; they are not probable or obvious victims. However, subset (2) may
contain many victims as well; for instance, calls that do sequential processing
(low service time per page), which are slow due to excessive CPU queuing time.
If we were aware of any lock wait or drive and CPU queuing time, as we would
be with LRT-level spike reports, we could filter out most of these victims. As it
is, we only have the four values, and so unfortunately the second subset (2) may
Search WWH ::




Custom Search