Block-Range Parallelism
Oracle has featured the ability to dynamically parallelize table scans and a variety of
scan-based functions since Oracle7. This parallelism is based on the notion of block
ranges, in which the Oracle server understands that each table contains a set of data
blocks that spans a defined range of data. Block-range parallelism is implemented by
dynamically breaking a table into pieces, each of which is a range of blocks, and then
using multiple processes to work on these pieces in parallel. Oracle's implementation of
block-range parallelism is unique in that it doesn't require physically partitioned tables
to achieve parallelism.
With block-range parallelism, the client session that issues the SQL statement transparently becomes the parallel execution coordinator, dynamically determining block ranges and assigning them to a set of parallel execution (PE) processes. Once a PE
process has completed an assigned block range, it returns to the coordinator for more
work. Not all I/O occurs at the same rate, so some PE processes may process more blocks
than others. This notion of “stealing work” allows all processes to participate fully in
the task, providing maximum leverage of the machine resources.
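The coordinator-and-worker pattern described above can be sketched in a few lines of Python. This is an illustrative simulation of dynamic block-range assignment, not Oracle internals: the function name, range size, and worker count are all hypothetical, but it shows why "stealing work" from a shared pool keeps every process busy until the whole table is scanned.

```python
import queue
import threading

def scan_in_parallel(total_blocks, range_size, num_workers):
    """Simulate block-range parallelism: the coordinator splits the table
    into block ranges, and each PE worker pulls the next range from a
    shared queue until none remain (hypothetical sketch)."""
    work = queue.Queue()
    for start in range(0, total_blocks, range_size):
        work.put((start, min(start + range_size, total_blocks)))

    blocks_done = [0] * num_workers  # per-worker tally of blocks scanned

    def worker(wid):
        while True:
            try:
                start, end = work.get_nowait()
            except queue.Empty:
                return  # no ranges left: this worker is finished
            blocks_done[wid] += end - start  # "scan" the assigned range

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return blocks_done

# Every block is scanned exactly once, however unevenly the
# ranges end up distributed across workers.
done = scan_in_parallel(total_blocks=1000, range_size=64, num_workers=4)
print(sum(done))  # 1000
```

Because a fast worker simply returns to the queue for another range, no worker sits idle while slower ones finish, which is the point of the dynamic scheme.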
Block-range parallelism scales linearly based on the number of PE processes if there are
adequate hardware resources. The key to achieving scalability with parallelism lies in
hardware basics. Each PE process runs on a CPU and requests I/O to a device. If there is enough CPU capacity and enough disk bandwidth, parallelism will scale. If the
system encounters a bottleneck on either resource, scalability will suffer. For
example, four CPU cores reading two disks will not scale much beyond the two-way
scalability of the disks and may even sink below this level if the additional CPUs cause
contention for the disks. Similarly, two CPU cores reading 20 disks will not scale to a
20-fold performance improvement. The system hardware must be balanced for parallelism to scale.
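The balance argument reduces to a simple bottleneck model: aggregate scan throughput is capped by whichever resource saturates first. The per-core and per-disk rates below are made-up illustrative numbers, not Oracle measurements.

```python
def scan_throughput(num_cpus, cpu_mb_per_s, num_disks, disk_mb_per_s):
    """Toy bottleneck model: a parallel scan runs at the rate of the
    slower aggregate resource (illustrative, assumed numbers)."""
    return min(num_cpus * cpu_mb_per_s, num_disks * disk_mb_per_s)

# Four cores reading two disks: the two disks cap throughput.
print(scan_throughput(4, 200, 2, 100))   # 200 (MB/s)

# Two cores reading 20 disks: now the CPUs are the bottleneck.
print(scan_throughput(2, 200, 20, 100))  # 400 (MB/s)
```

In both unbalanced cases, adding more of the already-plentiful resource buys nothing, which is why balanced hardware is the prerequisite for linear scaling.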
Historically, most large systems had far more disks than CPU cores. In these systems,
parallelism results in a randomization of I/O across the I/O subsystem. This is useful
for concurrent access to data as PE processes for different users read from different disks
at different times, resulting in I/O that is distributed across the available disks.
A useful analogy for dynamic parallelism is eating a pie. The pie is the set of blocks to
be read for the operation, and the goal is to eat the pie as quickly as possible using a
certain number of people. Oracle serves the pie in helpings, and when people finish
their first helping, they can come back for more. Not everyone eats at the same rate, so
some people will consume more pie than others. While this approach in the real world
is somewhat unfair, it's a good model for parallelism because if everyone is eating all the
time, the pie will be consumed more quickly. The alternative is to give each person an
equal serving and wait for the slower eaters to finish.
Figure 7-3 illustrates the splitting of a set of blocks into ranges.