Physical Data Warehouse Design - Data Warehouse Systems: Design and Implementation

smaller joins that occur between each of the partitions, producing significant performance gains, which can be improved even further by taking advantage of parallel execution.

7.7.2 Managing Partitioned Databases

Partitioning also eases the job of database and data warehouse administrators, since tables and indexes are divided into smaller, more manageable pieces of data. In this way, maintenance operations can be performed on individual portions of tables. For example, a database administrator may back up just a single partition of a table instead of the whole table. In addition, partitioned tables and indexes increase data availability. For example, if some partitions of a table become unavailable, most of the other partitions of the table may remain online and available, in particular if partitions are allocated to different devices. In this way, applications can continue to execute queries and transactions that do not need to access the unavailable partitions. Even during normal operation, since each partition can be stored in a separate tablespace, backup and recovery operations can be performed on individual partitions independently of each other. Thus, the active parts of the database can be made available sooner than in the case of an unpartitioned table.
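
The availability argument can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the partition names, the monthly ranges, and the online flags are hypothetical, and a real database system tracks partition status internally rather than in a dictionary.

```python
from datetime import date

# Hypothetical monthly partitions: name -> (first day, last day, online?).
PARTITIONS = {
    "sales_2012_01": (date(2012, 1, 1), date(2012, 1, 31), True),
    "sales_2012_02": (date(2012, 2, 1), date(2012, 2, 29), False),  # assumed device failure
    "sales_2012_03": (date(2012, 3, 1), date(2012, 3, 31), True),
}


def partitions_needed(start: date, end: date) -> list[str]:
    """Names of the partitions whose range overlaps the closed interval [start, end]."""
    return [
        name
        for name, (first, last, _online) in PARTITIONS.items()
        if first <= end and start <= last
    ]


def can_run(start: date, end: date) -> bool:
    """A query can run if every partition it needs is online."""
    return all(PARTITIONS[name][2] for name in partitions_needed(start, end))


print(can_run(date(2012, 3, 5), date(2012, 3, 20)))   # True: only March is needed
print(can_run(date(2012, 1, 15), date(2012, 2, 10)))  # False: February is offline
```

A query restricted to March touches only an online partition and can proceed, while a query spanning January and February must wait until the February partition is recovered.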

7.7.3 Partitioning Strategies

The three most common partitioning strategies in database systems are range partitioning, hash partitioning, and list partitioning.

The most common type of partitioning is range partitioning, which maps records to partitions based on ranges of values of the partitioning key. The temporal dimension is a natural candidate for range partitioning, although other attributes can be used. For example, if a table contains a date column defined as the partitioning key, the January 2012 partition will contain rows with key values from January 1, 2012, to January 31, 2012.
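
The mapping from a date key to a monthly range partition can be sketched in Python as follows. This is a minimal illustration, not how any particular database system implements it: the partition names and the three-month layout are hypothetical, and the last partition is assumed to be open-ended.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical inclusive lower bounds of the monthly partitions, sorted ascending.
LOWER_BOUNDS = [date(2012, 1, 1), date(2012, 2, 1), date(2012, 3, 1)]
PARTITION_NAMES = ["sales_2012_01", "sales_2012_02", "sales_2012_03"]


def range_partition(key: date) -> str:
    """Return the name of the partition whose range contains the key."""
    # bisect_right counts how many lower bounds are <= key, so subtracting 1
    # gives the index of the last partition that starts on or before the key.
    idx = bisect_right(LOWER_BOUNDS, key) - 1
    if idx < 0:
        raise ValueError(f"{key} precedes the first partition")
    # Assumption: the last partition has no upper bound in this sketch.
    return PARTITION_NAMES[idx]


print(range_partition(date(2012, 1, 1)))   # sales_2012_01
print(range_partition(date(2012, 1, 31)))  # sales_2012_01
print(range_partition(date(2012, 2, 1)))   # sales_2012_02
```

Both January 1 and January 31 fall in the January partition, matching the example in the text, while February 1 starts the next one.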

Hash partitioning maps records to partitions based on a hashing algorithm applied to the partitioning key. The hashing algorithm distributes rows among partitions in a uniform fashion, ideally yielding partitions of the same size. It is typically used when partitions are distributed across several devices and, in general, when data are not partitioned based on time, since hashing is more likely to yield an even distribution of records across partitions.
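
A rough Python sketch of hash partitioning is given below. The choice of SHA-256, the number of partitions, and the customer keys are all assumptions made for illustration; database systems use their own internal hash functions.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical number of partitions


def hash_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partitioning key to a partition number in [0, num_partitions)."""
    # Use a stable hash: Python's built-in hash() for strings is salted per
    # process, so it would not assign keys consistently across runs.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Distribute 10,000 hypothetical customer keys and count the rows per partition.
counts = Counter(hash_partition(f"customer-{i}") for i in range(10_000))
for partition in sorted(counts):
    print(partition, counts[partition])
```

With a well-behaved hash function, the printed counts should be close to 2,500 each, which is the near-uniform distribution described above.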

Finally, list partitioning allows explicit control over how rows are mapped to partitions by specifying a list of values of the partitioning key for each partition. In this way, data can be organized in an ad hoc fashion.
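
List partitioning can be sketched in Python as a lookup from explicitly listed key values to partitions. The partition names and the country lists are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical explicit lists of key values assigned to each partition.
PARTITION_LISTS = {
    "sales_europe": ["Belgium", "France", "Germany"],
    "sales_americas": ["Argentina", "Brazil", "Canada"],
}

# Invert the definition once so that each key value can be looked up directly.
VALUE_TO_PARTITION = {
    value: partition
    for partition, values in PARTITION_LISTS.items()
    for value in values
}


def list_partition(key: str) -> str:
    """Return the partition that explicitly lists the given key value."""
    try:
        return VALUE_TO_PARTITION[key]
    except KeyError:
        raise ValueError(f"no partition lists the value {key!r}") from None


print(list_partition("France"))  # sales_europe
print(list_partition("Brazil"))  # sales_americas
```

Unlike range or hash partitioning, a key value that appears in no list has no partition, so this sketch rejects it with an error.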
