Database Reference
In-Depth Information
Parallel Processing Options. Parallel processing options in database software are
intended only for machines with multiple processors. Most current database
software can parallelize a large number of operations, including the
following: mass loading of data, full table scans, queries with exclusion conditions,
queries with grouping, selection with distinct values, aggregation, sorting, creation of
tables using subqueries, creating and rebuilding indexes, inserting rows into one table
from other tables, enabling constraints, star transformation (an optimization
technique for processing queries against a STAR schema), and so on. This
is an impressive list of operations that the RDBMS can process in parallel.
Let us now examine what happens when a user initiates a query at the
workstation. Each session accesses the database through a server process. The query is
sent to the DBMS, and data retrieval takes place from the database. Data are
retrieved and the results are sent back, all under the control of the dedicated server
process. The query dispatcher software splits the work and
distributes the units to be performed among the pool of available query server
processes to balance the load. Finally, the results of the query processes are
assembled and returned as a single, consolidated result set.
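The dispatcher flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ORDERS table, the scan_chunk and dispatch_query functions, and the n_servers parameter are hypothetical names, and a thread pool stands in for the pool of query server processes (in CPython, threads do not give true CPU parallelism, so the sketch shows the structure, not the speedup).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "table" of (order_id, amount) rows.
ORDERS = [(i, i * 10) for i in range(1, 101)]

def scan_chunk(rows, min_amount):
    """One query server: scan its share of the rows and apply the filter."""
    return [row for row in rows if row[1] >= min_amount]

def dispatch_query(rows, min_amount, n_servers=4):
    """Dispatcher: split the work, hand the units to the pool, merge the results."""
    chunk_size = max(1, -(-len(rows) // n_servers))  # ceiling division
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    with ThreadPoolExecutor(max_workers=n_servers) as pool:
        # map() returns the partial results in the same order as the chunks.
        partials = pool.map(scan_chunk, chunks, [min_amount] * len(chunks))
        # Consolidate the partial results into a single result set.
        return [row for part in partials for row in part]

result = dispatch_query(ORDERS, min_amount=900)
```

Because the partial results are merged in chunk order, the consolidated result set matches what a single serial scan of the same rows would return.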
Interquery Parallelization. In this method, several server processes handle
multiple requests simultaneously. Multiple queries may be serviced based on your server
configuration and the number of available processors. You may take advantage of
this feature of the DBMS successfully on SMP systems, thereby increasing
throughput and supporting more concurrent users.
However, interquery parallelism is limited. Let us see what happens here.
Multiple queries are processed concurrently, but each query is still being processed
serially by a single server process. Suppose that a query consists of index read, data
read, sort, and join operations; then these operations are carried out in this order.
Each operation must finish before the next one can begin. Parts of the same query
do not execute in parallel. To overcome this limitation, many DBMS vendors have
come up with versions of their products that provide intraquery parallelization.
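The limitation can be seen in a small Python sketch. It is an illustration only: run_query_serially and the query names Q1 to Q3 are hypothetical, and worker threads stand in for server processes. Several queries run at the same time, but inside each query the operations still run one after another.

```python
from concurrent.futures import ThreadPoolExecutor

# The operations of one query, in the order given in the text.
STEPS = ["index read", "data read", "sort", "join"]

def run_query_serially(query_name):
    """One server process: carries out the steps of a single query in order."""
    trace = []
    for step in STEPS:  # each step finishes before the next one begins
        trace.append(f"{query_name}: {step}")
    return trace

queries = ["Q1", "Q2", "Q3"]  # hypothetical concurrent requests

# Interquery parallelism: one worker per query, no parallelism within a query.
with ThreadPoolExecutor(max_workers=len(queries)) as pool:
    traces = list(pool.map(run_query_serially, queries))
```

Adding workers lets more queries run side by side, which raises throughput, but it does nothing to shorten any single query.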
Intraquery Parallelization. Using the intraquery parallelization technique, the
DBMS splits the query into the lower-level operations of index read, data read, join,
and sort. Each of these base operations is then executed on a single
processor, in parallel with the others. The final result set is the consolidation of the intermediate results. Let
us review three ways a DBMS can provide intraquery parallelization, that is,
parallelization of parts of the operations within the same query itself.
Horizontal parallelism. The data are partitioned across multiple disks. Parallel
processing occurs within each single task in the query. For example, the data read is
performed concurrently on multiple processors, each reading a different set of data
from the multiple disks. Once a task has completed on all of the
relevant parts of the partitioned data, the next task of that query is carried out, then
the one after that, and so on.
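As a rough sketch of horizontal parallelism, the Python below runs each task on every partition concurrently and starts the next task only after the previous one has finished on all partitions. The names PARTITIONS, data_read, sort_task, and run_task_on_all are hypothetical, the filter in data_read is an arbitrary stand-in for a query predicate, and threads stand in for processors.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions, standing in for data spread across three disks.
PARTITIONS = [[7, 3, 9], [4, 8, 1], [6, 2, 5]]

def data_read(partition):
    """Task 1: read one partition, keeping only the values above 2."""
    return [v for v in partition if v > 2]

def sort_task(rows):
    """Task 2: sort the rows that were read from one partition."""
    return sorted(rows)

def run_task_on_all(task, inputs):
    """Run one task on every partition concurrently; return when all are done."""
    with ThreadPoolExecutor(max_workers=len(inputs)) as pool:
        return list(pool.map(task, inputs))

read_results = run_task_on_all(data_read, PARTITIONS)   # task 1, all partitions
sorted_runs = run_task_on_all(sort_task, read_results)  # task 2 starts afterwards
final = sorted(v for run in sorted_runs for v in run)   # consolidate the results
```

The call to run_task_on_all acts as a barrier between tasks: the sort does not begin on any partition until the data read has finished on all of them.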
Vertical parallelism. This kind of parallelism occurs among different tasks, not just
a single task in a query as in the case of horizontal parallelism. All component query