Database Reference
In-Depth Information
Procedural Parallelism
I would like to discuss two types of procedural parallelism:
Parallel pipelined functions, which is a feature of Oracle.
Do-it-yourself (DIY) parallelism, which is the application to your own applications of the
same techniques that Oracle applies to parallel full table scans. DIY parallelism is more
of a development technique than anything built into Oracle directly in Oracle Database
11 g Release 1 and before - and a native database feature in Oracle Database 11 g
Release 2 and later.
Often you'll find that applications—typically batch processes—designed to execute serially will look something
like the following procedure:
Create procedure process_data
As
Begin
For x in ( select * from some_table )
Perform complex process on X
Update some other table, or insert the record somewhere else
End loop
end
In this case, Oracle's parallel query or PDML won't help a bit (in fact, parallel execution of the SQL by Oracle
here would likely only cause the database to consume more resources and take longer). If Oracle were to execute
the simple SELECT * FROM SOME_TABLE in parallel, it would provide this algorithm no apparent increase in speed
whatsoever. If Oracle were to perform in parallel the UPDATE or INSERT after the complex process, it would have no
positive affect (it is a single-row UPDATE / INSERT , after all).
There is one obvious thing you could do here: use array processing for the UPDATE / INSERT after the complex
process. However, that isn't going to give you a 50 percent reduction or more in runtime, which is often what you're
looking for. Don't get me wrong, you definitely want to implement array processing for the modifications here, but it
won't make this process run two, three, four, or more times faster.
Now, suppose this process runs at night on a machine with four CPUs, and it is the only activity taking place. You
have observed that only one CPU is partially used on this system, and the disk system is not being used very much at
all. Further, this process is taking hours, and every day it takes a little longer as more data is added. You need to reduce
the runtime dramatically—it needs to run four or eight times faster—so incremental percentage increases will not be
sufficient. What can you do?
There are two approaches you can take. One approach is to implement a parallel pipelined function, whereby
Oracle will decide on appropriate degrees of parallelism (assuming you have opted for that, which would be
recommended). Oracle will create the sessions, coordinate them, and run them, very much like the previous example
with parallel DDL or parallel DML where, by using CREATE TABLE AS SELECT or INSERT /*+ APPEND */ , Oracle
fully automated parallel direct path loads for us. The other approach is DIY parallelism. We'll take a look at both
approaches in the sections that follow.
Parallel Pipelined Functions
We'd like to take that very serial process PROCESS_DATA from earlier and have Oracle execute it in parallel for us.
To accomplish this, we need to turn the routine inside out. Instead of selecting rows from some table, processing
them, and inserting them into another table, we will insert into another table the results of fetching some rows and
processing them. We will remove the INSERT at the bottom of that loop and replace it in the code with a PIPE ROW
clause. The PIPE ROW clause allows our PL/SQL routine to generate table data as its output, so we'll be able to SELECT
 
Search WWH ::




Custom Search