Physical Data Warehouse Design - Data Warehouse Systems: Design and Implementation

Database Reference

In-Depth Information

precalculated must be computed at run time, indexing methods are required

to ensure effective query processing. Traditional indexing techniques for

OLTP systems are not appropriate for multidimensional data. Indeed, most

OLTP transactions access only a small number of tuples, and the indexing

techniques used are designed for this situation. Since data warehouse queries

typically access a large number of tuples, alternative indexing mechanisms

are needed.

Two common types of indexes for data warehouses are bitmap indexes

and join indexes. Bitmap indexes are a special kind of index, particularly

useful for columns with a low number of distinct values (i.e., low cardinality

attributes), although several compression techniques eliminate this limita-

tion. On the other hand, join indexes materialize a relational join between

two tables by keeping pairs of row identifiers that participate in the join. In

data warehouses, join indexes relate the values of dimensions to rows in the

fact table. For example, given a fact table Sales and a dimension Client ,ajoin

index maintains for each client a list of row identifiers of the tuples recording

the sales to this client. Join indexes can be combined with bitmap indexes,

as we will see in this chapter.

Partitioning or fragmentation is a mechanism frequently used in

relational databases to reduce the execution time of queries. It consists

in dividing the contents of a relation into several files that can be more

eciently processed in this way. There are two ways of partitioning a relation:

vertically and horizontally. Vertical partitioning splits the attributes of a

table into groups that can be independently stored. For example, a table can

be partitioned such that the most often used attributes are stored in one

partition, while other less often used attributes are kept in another partition.

Also, column-store database systems (that will be studied in Chap. 13 )make

use of this technique. Horizontal partitioning divides the records of a

table into groups according to a particular criterion. A common horizontal

partitioning scheme in data warehouses is based on time, where each partition

contains data about a particular time period, for instance, a year or a range

of months.

In the following sections, we study these techniques in detail.

7.2 Materialized Views

We know that a view is a derived relation defined in terms of base relations

or other views, by means of the CREATE VIEW statement in SQL. A view

is recomputed every time it is invoked. A materialized view, on the other

hand, is a view that is physically stored in the database. This improves

query performance, playing the role of a cache which can be directly accessed

without looking into the base relations. But this benefit has a counterpart.

When the base relations are updated, the materialized views derived from

Search WWH ::

Custom Search

Home