TRENDS IN DATABASE TECHNOLOGY - Database Design and Development: An Essential Guide for IT Professionals


Parallel Processing Options. Parallel processing options in database software are intended only for machines with multiple processors. Most current database software can parallelize a large number of operations, including the following: mass loading of data, full table scans, queries with exclusion conditions, queries with grouping, selection of distinct values, aggregation, sorting, creation of tables using subqueries, creation and rebuilding of indexes, insertion of rows into one table from other tables, enabling of constraints, and star transformation (an optimization technique for processing queries against a STAR schema), among others. This is a substantial list of operations that the RDBMS can process in parallel.

Let us now examine what happens when a user initiates a query at the workstation. Each session accesses the database through a server process. The query is sent to the DBMS, the data are retrieved from the database, and the results are sent back, all under the control of the dedicated server process. When the query is processed in parallel, the query dispatcher software splits the work into units and distributes them among the pool of available query server processes to balance the load. Finally, the results of the query server processes are assembled and returned as a single, consolidated result set.
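
The dispatch-and-consolidate flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular DBMS implements it: the names `dispatch_query` and `scan_chunk` are hypothetical, the table is an in-memory list, and a thread pool stands in for the pool of query server processes.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(rows, predicate):
    """One query server's unit of work: scan a chunk and keep matching rows."""
    return [row for row in rows if predicate(row)]


def dispatch_query(table, predicate, num_servers=4):
    """Hypothetical dispatcher: split the table, farm out chunks, merge results."""
    # Split the table into roughly equal contiguous chunks, one per server.
    chunk_size = max(1, -(-len(table) // num_servers))  # ceiling division
    chunks = [table[i:i + chunk_size] for i in range(0, len(table), chunk_size)]
    with ThreadPoolExecutor(max_workers=num_servers) as pool:
        # map() yields results in chunk order, so the consolidated
        # result set keeps the original table order.
        partials = pool.map(scan_chunk, chunks, [predicate] * len(chunks))
        return [row for part in partials for row in part]


# Illustrative data: ten orders with amounts 10, 20, ..., 100.
orders = [{"id": i, "amount": i * 10} for i in range(1, 11)]
large_orders = dispatch_query(orders, lambda r: r["amount"] > 70)
```

The caller sees one consolidated list in `large_orders`, even though the scan was carried out as several independent units of work.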

Interquery Parallelization. In this method, several server processes handle multiple requests simultaneously. How many queries can be serviced at once depends on your server configuration and the number of available processors. This feature of the DBMS works well on SMP systems, where it increases throughput and supports more concurrent users.

However, interquery parallelism is limited. Multiple queries are processed concurrently, but each query is still processed serially by a single server process. Suppose that a query consists of index read, data read, sort, and join operations; these operations are then carried out in that order. Each operation must finish before the next one can begin, so parts of the same query do not execute in parallel. To overcome this limitation, many DBMS vendors have come up with versions of their products that provide intraquery parallelization.
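
To make the limitation concrete, here is a minimal sketch of interquery parallelism, assuming a thread pool as a stand-in for the server processes. The names `serve_queries` and `run_query_serially` are hypothetical, and the stages only record what they would do. Several queries are in flight at once, but each one passes through its stages strictly in order on a single worker.

```python
from concurrent.futures import ThreadPoolExecutor

# The four operations of the example query, in execution order.
STAGES = ("index read", "data read", "sort", "join")


def run_query_serially(query_name):
    """One server process handles one whole query, stage by stage."""
    log = []
    for stage in STAGES:
        # Each stage finishes before the next one begins.
        log.append(f"{query_name}: {stage}")
    return log


def serve_queries(query_names, num_servers=2):
    """Run different queries concurrently; no single query is split up."""
    with ThreadPoolExecutor(max_workers=num_servers) as pool:
        return dict(zip(query_names, pool.map(run_query_serially, query_names)))


logs = serve_queries(["Q1", "Q2", "Q3"])
```

Adding servers to this sketch lets more queries run side by side, but it never makes an individual query finish sooner, which is the gap intraquery parallelization is meant to close.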

Intraquery Parallelization. Using the intraquery parallelization technique, the DBMS splits the query into the lower-level operations of index read, data read, join, and sort. These base operations are then executed in parallel across the available processors rather than serially on a single processor. The final result set is the consolidation of the intermediate results. Let us review three ways a DBMS can provide intraquery parallelization, that is, parallelization of the operations within a single query.

Horizontal parallelism. The data are partitioned across multiple disks. Parallel processing occurs within each single task in the query: a data read, for example, is performed concurrently on multiple processors, each reading a different set of data spread across the disks. After the first task has completed on all of the relevant parts of the partitioned data, the next task of the query is carried out, then the one after that, and so on.
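
A rough sketch of horizontal parallelism follows, again with assumptions labeled: the three lists stand in for partitions on separate disks, a thread pool stands in for multiple processors, and `horizontal_query`, `read_partition`, and `sort_partition` are hypothetical names. Each task runs on all partitions at once, and the next task starts only after the previous one has finished everywhere.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def read_partition(partition):
    """Task 1 (data read): read one partition, as if from its own disk."""
    return list(partition)


def sort_partition(rows):
    """Task 2 (sort): sort the rows read from one partition."""
    return sorted(rows)


def horizontal_query(partitions):
    """Run each task across all partitions in parallel, one task at a time."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        # list() waits for every partition's read before any sort starts.
        read_results = list(pool.map(read_partition, partitions))
        sorted_runs = list(pool.map(sort_partition, read_results))
    # Consolidate the per-partition results into one ordered result set.
    return list(heapq.merge(*sorted_runs))


# Illustrative data: one table spread over three "disks".
disks = [[42, 7, 19], [3, 88, 61], [25, 14, 70]]
result = horizontal_query(disks)
```

The barrier between the two `pool.map` calls is the defining feature here: parallelism exists inside each task, not between tasks.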

Vertical parallelism. This kind of parallelism occurs among different tasks, not just

a single task in a query as in the case of horizontal parallelism. All component query
