Database Reference
In-Depth Information
precalculated must be computed at run time, indexing methods are required
to ensure effective query processing. Traditional indexing techniques for
OLTP systems are not appropriate for multidimensional data. Indeed, most
OLTP transactions access only a small number of tuples, and the indexing
techniques used are designed for this situation. Since data warehouse queries
typically access a large number of tuples, alternative indexing mechanisms
are needed.
Two common types of indexes for data warehouses are bitmap indexes
and join indexes. Bitmap indexes are a special kind of index, particularly
useful for columns with a low number of distinct values (i.e., low cardinality
attributes), although several compression techniques eliminate this limita-
tion. On the other hand, join indexes materialize a relational join between
two tables by keeping pairs of row identifiers that participate in the join. In
data warehouses, join indexes relate the values of dimensions to rows in the
fact table. For example, given a fact table Sales and a dimension Client ,ajoin
index maintains for each client a list of row identifiers of the tuples recording
the sales to this client. Join indexes can be combined with bitmap indexes,
as we will see in this chapter.
Partitioning or fragmentation is a mechanism frequently used in
relational databases to reduce the execution time of queries. It consists
in dividing the contents of a relation into several files that can be more
eciently processed in this way. There are two ways of partitioning a relation:
vertically and horizontally. Vertical partitioning splits the attributes of a
table into groups that can be independently stored. For example, a table can
be partitioned such that the most often used attributes are stored in one
partition, while other less often used attributes are kept in another partition.
Also, column-store database systems (that will be studied in Chap. 13 )make
use of this technique. Horizontal partitioning divides the records of a
table into groups according to a particular criterion. A common horizontal
partitioning scheme in data warehouses is based on time, where each partition
contains data about a particular time period, for instance, a year or a range
of months.
In the following sections, we study these techniques in detail.
7.2 Materialized Views
We know that a view is a derived relation defined in terms of base relations
or other views, by means of the CREATE VIEW statement in SQL. A view
is recomputed every time it is invoked. A materialized view, on the other
hand, is a view that is physically stored in the database. This improves
query performance, playing the role of a cache which can be directly accessed
without looking into the base relations. But this benefit has a counterpart.
When the base relations are updated, the materialized views derived from
Search WWH ::




Custom Search