Physical Data Warehouse Design - Data Warehouse Systems: Design and Implementation

Database Reference

In-Depth Information

defined over an unpartitioned table, and vice versa, a partitioned table

may have unpartitioned indexes defined over it. Database vendors provide

several different partitioning methods, each of them having particular design

considerations.

There are two ways of partitioning a table: vertically and horizontally.

Vertical partitioning splits the attributes of a table into groups that can

be independently stored. For example, a table can be partitioned such that

the most often used attributes are stored in one partition, while other less

often used attributes are kept in another partition. In this way, more records

can be brought into main memory, reducing their processing time. On the

other hand, horizontal partitioning divides a table into smaller tables that

have the same structure than the full table but fewer records. For example,

if some queries require the most recent data while others access older data,

a fact table can be horizontally partitioned according to some time frame,

for example, years. An obvious advantage of this kind of partitioning is that

refreshing the data warehouse is more ecient since only the last partition

must be accessed.

In addition to the above, partitioning database tables into smaller data sets

facilitates administrative tasks, increases query performance especially when

parallel processing is applied, and enables access to a smaller subset of the

data (if the user's selection does not refer to all partitions). Also, partitioning

is recommended when the contents of a table need to be distributed across

different servers. In the case of indexes, partitioning is advised, for example,

in order to perform maintenance on parts of the data without invalidating

the entire index. Finally, we remark that from an application perspective, a

partitioned table is identical to a nonpartitioned table, thus partitioning is

transparent for writing SQL queries and DML statements.

We next analyze further some characteristics of partitioning.

7.7.1 Queries in Partitioned Databases

There are two classic techniques of partitioning related to query evaluation.

Partition pruning is the typical way of improving query performance using

partitioning, often producing performance enhancements of several orders of

magnitude. For example, a Sales fact table in a warehouse can be partitioned

by month. A query requesting orders for a single month only needs to access

the partition corresponding to such a month. If the Sales table contains 2

years of historical data, this query would access one partition instead of 24

ones, greatly reducing query response time.

The execution time of joins can also be enhanced by using partitioning.

This occurs when the two tables to be joined are partitioned on the join

attributes or, in the case of foreign key joins, when the reference table is

partitioned on its primary key. In these cases, a large join is broken into

Search WWH ::

Custom Search

Home