approach is needed. We then show that such a systematic but still manual ap-
proach quickly becomes infeasible. Finally, we formalize this claim by showing
that the physical design problem is in fact NP-complete.
3.1.1 Rules of Thumb in Physical Database Design
The problem of tuning the physical design of a database is very important in
production systems and heavily depends on the specific workload and data
distributions. Over time, a number of rules of thumb emerged providing some
form of general guidance and best practices to follow during effective physical
design sessions. Virtually every course in database systems or topic that spe-
cializes in query performance discusses such rules. For illustration purposes,
we next describe a few of these rules of thumb:
Index primary keys and most foreign keys. Since most joins are per-
formed between columns in primary and foreign keys, including such
columns in indexes can improve performance of complex queries.
Columns frequently referenced in WHERE clauses are good candidates for
indexes. Columns that are referenced in equality or inequality predicates
in WHERE clauses (especially those that are frequently referenced) can be
eciently fetched by appropriate indexes.
Avoid redundant indexes. Indexes defined over the same (or almost the
same) columns are rarely beneficial, as they do not provide additional
utility over a single index but at the same time use additional space and
need to be updated.
Use index columns with caution when they are frequently updated. When
database columns are updated, all indexes defined over such columns
need to be maintained. A column that is frequently updated might trans-
form an otherwise useful index into a performance bottleneck.
Consider covering indexes for critical queries. Covering indexes (i.e.,
those indexes that contain all columns required in a query) are very good
at improving performance, but at the same time they can be large and
useful only for a few queries. Queries that are critical or very frequent,
however, might benefit from covering indexes.
Avoid indexes on small tables. Indexes on tables that fit in one or two
pages often do not provide additional performance benefits but increase
the number of structures to maintain and administer.
We can argue about specific rules or their conditions of applicability, but
in general these best practices make sense and can be justified while taken
in isolation. At the same time, they are mostly qualitative assertions and
principles that do not easily lead to detailed and well-defined “recipes.” It
is therefore dicult to know exactly when and how to combine these rules,
especially in the presence of multiple complex queries or storage constraints. In