Database Reference
In-Depth Information
Chapter 3
Physical Database Design
Relational database systems provide physical data independence. That is,
no matter which index configuration constitutes the physical design of the
database, queries always return the same results. Of course, the performance
of evaluating queries over different physical designs (i.e., different index con-
figurations) can vary significantly. For that reason, together with the capabili-
ties of the execution engine and query optimizer (discussed in Chapter 2), the
physical design of a database determines how eciently a query is executed.
There is no such a thing as the optimal physical design for a database,
unless we state the problem in the context of a given workload. The reason
is that indexes are useful only if they speed up queries without significantly
slowing down updates. Good candidate indexes for some workloads might be
completely irrelevant—or even harmful—for others. Additionally, indexes are
redundant structures that consume disk storage. Since available storage is
always limited, deciding which indexes to include in the physical design of
a database becomes a dicult problem. The physical design problem can be
stated as follows:
Physical design problem: Given a workload W consisting of queries and
updates, and a storage bound B , obtain the index configuration C that fits in
B and results in queries in W executing as eciently as possible.
To further clarify this statement, we next discuss alternative interpretations
to the components of the problem statement. We also explain which interpre-
tations we follow in the rest of the topic and point to specific chapters that
explore other alternatives.
First there is the question of what constitutes a workload W and how it is
obtained. The most prevalent choice is to consider the workload W as a set
of SQL queries and updates, possibly assigning a weight (or importance) to
each query in W . Workloads can be generated after monitoring a production
database system for some amount of time, by performing static analysis of
database applications, or even after a careful manual design. We are agnostic
to the specific approach to obtain a workload W and require only that W is a
representative sample of the real-world workload that the database manage-
ment system (DBMS) would process. Other alternative representations for W
(and their implications) are explored in Chapter 10.
43
Search WWH ::




Custom Search