Database Reference
In-Depth Information
In general, parallel execution works best when given access to as many resources (CPU, memory, and I/O) as
possible. However, that is not to say that nothing can be gained from parallel query if the entire set of data were on a
single disk, but you would perhaps not gain as much as would be gained using multiple disks. The reason you would
likely gain some speed in response time, even when using a single disk, is that when a given parallel execution server
is counting rows, it is not reading them, and vice versa. So, two parallel execution servers may well be able to complete
the counting of all rows in less time than a serial plan would.
Likewise, you can benefit from parallel query even on a single CPU machine. It is doubtful that a serial
SELECT COUNT(*) would use 100 percent of the CPU on a single CPU machine—it would be spending part of its
time performing (and waiting for) physical I/O to disk. Parallel query would allow you to fully utilize the resources
(the CPU and I/O, in this case) on the machine, whatever those resources may be.
That final point brings us back to the earlier quote from Practical Oracle8i: Building Efficient Databases : parallel
query is essentially nonscalable. If you allowed four sessions to simultaneously perform queries with two parallel
execution servers on that single CPU machine, you would probably find their response times to be longer than if they
just processed serially. The more processes clamoring for a scarce resource, the longer it will take to satisfy all requests.
And remember, parallel query requires two things to be true. First, you need to have a large task to perform—for
example, a long-running query, the runtime of which is measured in minutes, hours, or days, not in seconds or
subseconds. This implies that parallel query is not a solution to be applied in a typical OLTP system, where you are not
performing long-running tasks. Enabling parallel execution on these systems is often disastrous.
Second, you need ample free resources such as CPU, I/O, and memory. If you are lacking in any of these, then
parallel query may well push your utilization of that resource over the edge, negatively impacting overall performance
and runtime.
In Oracle 11 g Release 2 and above, a new bit of functionality has been introduced to try and limit this over
commitment of resources: Parallel Statement Queuing (PSQ). When using PSQ, the database will limit the number
of concurrently executing parallel queries—and place any further parallel requests in an execution queue. When
the CPU resources are exhausted (as measured by the number of parallel execution servers in concurrent use), the
database will prevent new requests from becoming active. These requests will not fail—rather they will have their start
delayed, they will be queued. As resources become available (as parallel execution servers that were in use finish their
tasks and become idle), the database will begin to execute the queries in the queue. In this fashion, as many parallel
queries as make sense can run concurrently, without overwhelming the system, while subsequent requests politely
wait their turn. In all, everyone gets their answer faster, but a waiting line is involved.
Parallel query was once considered mandatory for many data warehouses simply because in the past (say, in
1995), data warehouses were rare and typically had a very small, focused user base. Today, data warehouses are
literally everywhere and support user communities that are as large as those found for many transactional systems.
This means that you might not have sufficient free resources at any given point in time to enable parallel query on
these systems. This doesn't mean parallel execute is not useful in this case—it just might be more of a DBA tool, as
we'll see in the section “Parallel DDL,” rather than a parallel query tool.
Parallel DML
The Oracle documentation limits the scope of parallel DML (PDML) to include only INSERT , UPDATE , DELETE , and
MERGE (it does not include SELECT as normal DML does). During PDML, Oracle may use many parallel execution
servers to perform your INSERT , UPDATE , DELETE , or MERGE instead of a single serial process. On a multi-CPU machine
with plenty of I/O bandwidth, the potential increase in speed may be large for mass DML operations.
However, you should not look to PDML as a feature to speed up your OLTP-based applications. As stated previously,
parallel operations are designed to fully and totally maximize the utilization of a machine. They are designed so that a
single user can completely use all of the disks, CPU, and memory on the machine. In a certain data warehouse (with lots
of data and few users), this is something you may want to achieve. In an OLTP system (with a lot of users all doing short,
fast transactions), you do not want to give a user the ability to fully take over the machine resources.
 
Search WWH ::




Custom Search