Database Reference
In-Depth Information
7.1 Physical Modeling of Data Warehouses
In this section, we give an overview of the three classic techniques for
improving data warehouse performance: materialized views, indexing, and
partitioning. Later in the chapter we study these techniques in detail.
As we studied in Chap. 2 ,a view in the relational model is just a query
that is stored in the database with an associated name and which can then
be used like a normal table. This query can involve base tables (i.e., tables
physically stored in the database) and/or other views. A materialized view
is a view that is physically stored in a database. Materialized views enhance
query performance by precalculating costly operations such as joins and
aggregations and storing the results in the database. In this way, queries that
only need to access materialized views will be executed faster. Obviously, the
increased query performance is achieved at the expense of storage space.
A typical problem of materialized views is updating since all modifi-
cations to the underlying base tables must be propagated into the view.
Whenever possible, updates to materialized views are performed in an
incremental way, avoiding to recalculate the whole view from scratch. This
implies capturing the modifications to the base tables and determining how
they influence the content of the view. Much research work has been done in
the area of view maintenance. We study the most classic ones in this chapter.
In a data warehouse, given that the number of aggregates grows exponen-
tially with the number of dimensions and hierarchies, normally not all possible
aggregations can be precalculated and materialized. Thus, an important
problem in designing a data warehouse is the selection of materialized
views . The goal is to select an appropriate set of views that minimizes the
total query response time and the cost of maintaining the selected views,
given a limited amount of resources such as storage space or materialization
time. Many algorithms have been designed for selection of materialized views,
and currently some commercial DBMSs provide tools that tune the selection
of materialized views on the basis of previous queries to the data warehouse.
Once the views to be materialized have been defined, the queries addressed
to a data warehouse must be rewritten in order to best exploit such views
to improve query response time. This process, known as query rewriting ,
tries to use the materialized views as much as possible, even if they only
partially fulfill the query conditions. Selecting the best rewriting for a
query is a complex process, in particular for queries involving aggregations.
Many algorithms have been proposed for query rewriting in the presence of
materialized views. These algorithms impose various restrictions on the given
query and the potential materialized views so that the rewriting can be done.
A drawback of the materialized view approach is that it requires one to
anticipate the queries to be materialized. However, data warehouse queries
are often ad hoc and cannot always be anticipated. As queries which are not
Search WWH ::




Custom Search