Latch “Spinning”
One thing I'd like to drive home with regard to latches is this: latches are a type of lock, locks are serialization devices,
and serialization devices inhibit scalability. If your goal is to construct an application that scales well in an Oracle
environment, you must look for approaches and solutions that minimize the amount of latching you need to perform.
Even seemingly simple activities, such as parsing a SQL statement, acquire and release hundreds or thousands of
latches on the library cache and related structures in the shared pool. If we have a latch, then someone else might be
waiting for it. When we go to get a latch, we may well have to wait for it ourselves.
Waiting for a latch can be an expensive operation. If the latch is not available immediately and we are willing to
wait for it, as we likely are most of the time, then on a multi-CPU machine our session will spin, trying over and over,
in a loop, to get the latch. The reasoning behind this is that context switching (i.e., getting kicked off the CPU and
having to get back on the CPU) is expensive. So, if the process cannot get a latch immediately, we'll stay on the CPU
and try again immediately rather than just going to sleep, giving up the CPU, and trying later when we'll have to get
scheduled back on the CPU. The hope is that the holder of the latch is busy processing on the other CPU (and since
latches are designed to be held for very short periods of time, this is likely) and will give it up soon. If after spinning
and constantly trying to get the latch, we still fail to obtain it, only then will our process sleep, or take itself off of
the CPU, and let some other work take place. This sleep action is usually the result of many sessions concurrently
requesting the same latch; it is not that a single session is holding it for a long time, but rather that so many sessions
want it at the same time and each hold it for a short duration. If you do something short (fast) often enough, it adds
up! The pseudo-code for a latch get might look like this:
loop
    for i in 1 .. 2000
    loop
        try to get latch
        if got latch, return
        if i = 1 then misses = misses + 1
    end loop
    increment wait count
    sleep
    add wait time
end loop
The logic is to try to get the latch and, failing that, to increment the miss count, a statistic we can see in a
Statspack report or by querying the V$LATCH view directly. Once the process misses, it will loop some number of times
(an undocumented parameter controls the number of times and is typically set to 2,000), attempting to get the latch
over and over. If one of these get attempts succeeds, then it returns and we continue processing. If they all fail, the
process will go to sleep for a short duration of time, after incrementing the sleep count for that latch. Upon waking
up, the process begins all over again. This implies that the cost of getting a latch is not just the “test and set”-type
operation that takes place, but also a considerable amount of CPU time spent spinning while we try to get the latch.
Our system will appear to be very busy (with much CPU being consumed), but not much work is getting done.
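The spin-then-sleep loop described above can be sketched in Java using an atomic flag. This is only an illustration of the technique, not Oracle's actual latch implementation; the class and field names are invented for the example, and the spin count mirrors the typical value of 2,000 mentioned in the text.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative spin-then-sleep "latch": spin up to SPIN_COUNT times trying a
// test-and-set, and only then give up the CPU briefly before starting over.
class SpinLatch {
    private static final int SPIN_COUNT = 2000; // mirrors the typical setting in the text

    private final AtomicBoolean held = new AtomicBoolean(false);
    final AtomicLong misses = new AtomicLong(); // incremented on the first failed attempt
    final AtomicLong sleeps = new AtomicLong(); // incremented each time we give up the CPU

    void acquire() {
        while (true) {
            for (int i = 1; i <= SPIN_COUNT; i++) {
                // compareAndSet is the "test and set"-type operation
                if (held.compareAndSet(false, true)) {
                    return; // got the latch
                }
                if (i == 1) {
                    misses.incrementAndGet();
                }
            }
            // All spin attempts failed: sleep briefly, then try again
            sleeps.incrementAndGet();
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    void release() {
        held.set(false);
    }
}
```

Note that while a session is inside the `for` loop it stays on the CPU, which is exactly why heavy contention for a latch shows up as high CPU consumption with little useful work done.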
Measuring the Cost of Latching a Shared Resource
As an example, we'll study the cost of latching the shared pool. We'll compare a well-written program (one that uses
bind variables) and a program that is not so well written (it uses literal SQL, or unique SQL for each statement).
To do this, we'll use a very small Java program that simply logs into Oracle, turns off auto-commit (as all Java programs
should do immediately after connecting to a database), and executes 25,000 unique INSERT statements in a loop.
We'll perform two sets of tests: our program will not use bind variables in the first set, and in the second set it will.
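Before looking at the measured results, a small sketch shows why the two programs differ so much in the amount of parsing (and hence latching) they cause. The table name and SQL text here are illustrative, and no database connection is made: the point is simply that literal SQL produces a distinct statement text for every row, each of which the server must hard parse while holding shared pool latches, whereas the bind-variable form presents one shared statement.

```java
import java.util.HashSet;
import java.util.Set;

public class BindDemo {
    // Count how many textually distinct SQL statements the server would see
    // for a given number of rows, with and without bind variables.
    static int distinctStatements(int rows, boolean useBinds) {
        Set<String> seen = new HashSet<>();
        for (int i = 1; i <= rows; i++) {
            seen.add(useBinds
                ? "insert into t values (?)"              // same text every time
                : "insert into t values (" + i + ")");    // unique text each time
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctStatements(25000, false)); // 25000 hard parses
        System.out.println(distinctStatements(25000, true));  // 1 shared statement
    }
}
```

With literals, 25,000 inserts mean 25,000 distinct statements to parse; with a bind variable, the same work parses a single statement once and reuses it.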
 