Parallel Execution - Expert Oracle Database Architecture

Database Reference

In-Depth Information

Procedural Parallelism

I would like to discuss two types of procedural parallelism:

•

Parallel pipelined functions, which is a feature of Oracle.

•

Do-it-yourself (DIY) parallelism, which is the application to your own applications of the

same techniques that Oracle applies to parallel full table scans. DIY parallelism is more

of a development technique than anything built into Oracle directly in Oracle Database

11 g Release 1 and before - and a native database feature in Oracle Database 11 g

Release 2 and later.

Often you'll find that applications—typically batch processes—designed to execute serially will look something

like the following procedure:

Create procedure process_data

As

Begin

For x in ( select * from some_table )

Perform complex process on X

Update some other table, or insert the record somewhere else

End loop

end

In this case, Oracle's parallel query or PDML won't help a bit (in fact, parallel execution of the SQL by Oracle

here would likely only cause the database to consume more resources and take longer). If Oracle were to execute

the simple SELECT * FROM SOME_TABLE in parallel, it would provide this algorithm no apparent increase in speed

whatsoever. If Oracle were to perform in parallel the UPDATE or INSERT after the complex process, it would have no

positive affect (it is a single-row UPDATE / INSERT , after all).

There is one obvious thing you could do here: use array processing for the UPDATE / INSERT after the complex

process. However, that isn't going to give you a 50 percent reduction or more in runtime, which is often what you're

looking for. Don't get me wrong, you definitely want to implement array processing for the modifications here, but it

won't make this process run two, three, four, or more times faster.

Now, suppose this process runs at night on a machine with four CPUs, and it is the only activity taking place. You

have observed that only one CPU is partially used on this system, and the disk system is not being used very much at

all. Further, this process is taking hours, and every day it takes a little longer as more data is added. You need to reduce

the runtime dramatically—it needs to run four or eight times faster—so incremental percentage increases will not be

sufficient. What can you do?

There are two approaches you can take. One approach is to implement a parallel pipelined function, whereby

Oracle will decide on appropriate degrees of parallelism (assuming you have opted for that, which would be

recommended). Oracle will create the sessions, coordinate them, and run them, very much like the previous example

with parallel DDL or parallel DML where, by using CREATE TABLE AS SELECT or INSERT /*+ APPEND */ , Oracle

fully automated parallel direct path loads for us. The other approach is DIY parallelism. We'll take a look at both

approaches in the sections that follow.

Parallel Pipelined Functions

We'd like to take that very serial process PROCESS_DATA from earlier and have Oracle execute it in parallel for us.

To accomplish this, we need to turn the routine inside out. Instead of selecting rows from some table, processing

them, and inserting them into another table, we will insert into another table the results of fetching some rows and

processing them. We will remove the INSERT at the bottom of that loop and replace it in the code with a PIPE ROW

clause. The PIPE ROW clause allows our PL/SQL routine to generate table data as its output, so we'll be able to SELECT

Expert Oracle Database Architecture

Search WWH ::

Custom Search

Home