Database Reference
In-Depth Information
Ra
id a
1
R1 = σ a < 4 (R)
id a b
1 2 10
5 2 40
6 1 10
R 2 = σ 4 a < 6 (R )
id a b
2 4 20
4 5 30
R3 = σ 6 a (R)
id a
2
2
4
3
8
R
id a
4
5
b
5
2
1
2
10
6
1
2
4
20
3
8
20
Rb
id
4
5
30
b
5
2
40
1
10
6
1
10
2
20
3
20
b
4
30
3
8
20
5
40
6
10
Q= SELECT b
FROM R
WHERE a < 5
Q= SELECT b
FROM Ra, Rb
WHERE Ra.id = Rb.id
ANDa<5
Q= SELECT b FROM R1
UNION ALL
SELECT b FROM R2
WHEREa<5
(a) Original table.
(b) Vertical partitions.
(c) Horizontal partitions.
FIGURE 9.1
Vertical and horizontal table partitions.
9.1.2 Recommending Data Partitioning
The diversity of partitioning schemes supported in different DBMSs (e.g.,
vertical vs. horizontal, range vs. hash based) resulted in extensions to the
physical design problem that handle specific scenarios. We next summarize two
techniques for recommending data partitioning. For presentation purposes, we
introduce both techniques in the context of the basic bottom-up scheme that
we discussed in Chapters 4 and 6. Specifically, a candidate selection phase first
picks partitioning candidates. A subsequent merging step identifies additional
partitioning schemes that, while suboptimal for every query in the workload,
might be part of the best configuration. Finally, an enumeration step traverses
the resulting search space and identifies the best solution.
9.1.2.1
Hash-Based Partitioning in a Parallel DBMS
A parallel DBMS exploits fast and inexpensive processors to improve perfor-
mance by connecting independent servers via high-speed networks. Each pro-
cessor stores a portion of the database locally on disk, and query processing
attempts to perform as much local work as possible, thus minimizing slower
data transfer among servers. These systems thus partition data horizontally,
and we are interested in recommending a partitioning scheme that maximizes
workload performance. Subsequent index recommendations (if needed) can
Search WWH ::




Custom Search