Database Reference
In-Depth Information
is much higher than it was 20 years ago. A reasonable request for a new index
should not be rejected intuitively. With current disks, an indexed volatile column
may become an issue only if the column is updated perhaps more than 10 times
a second ; such columns are not very common.
SYSTEMATIC INDEX DESIGN
The first attempts toward an index design method originate from the 1960s. At
that time, textbooks recommended a matrix for predicting how often each field
(column) is read and updated and how often the records (rows) containing these
fields are inserted and deleted. This led to a list of columns to be indexed. The
indexes were generally assumed to have only a single column, and the objective
was to minimize the number of disk input/outputs (I/Os) during peak time. It is
amazing that this approach is still being mentioned in recent topics, although a
few, somewhat more realistic writers, do admit that the matrix should only cover
the most common transactions.
This column activity matrix approach may explain the column-oriented think-
ing that can be found even in recent textbooks and database courses, such as
consider indexing columns with these properties and avoid indexing columns with
those properties .
In the 1980s, the column-oriented approach began to lose ground to a
response-oriented approach . Enlightened DBAs started to realize that the objec-
tive of indexing should be to make all database calls fast enough, given the
hardware capacity constraints. The pseudo-relational DBMS of IBM S/38 (later
AS/400, then the iSeries) was the vanguard of this attitude. It automatically built
a good index for each database call. This worked well with simple applications.
Today, many products propose indexes for each SQL call, but indexes are not
created automatically, apart from primary key indexes and, sometimes, foreign
key indexes.
As applications became more complex and databases much larger, the impor-
tance and complexity of index design became obvious. Ambitious projects were
undertaken to develop tools for automating the design process. The basic idea
was to collect a sample of production workload and then generate a set of index
candidates for the SELECT statements in the workload. Simple evaluation for-
mulas or a cost-based optimizer would then be used to decide which indexes
were the most valuable. This sort of product has become available over the last
few years but has spread rather slower than expected. Possible reasons for this
are discussed in Chapter 16.
Systematic index design consists of two processes as shown in Figure 1.2.
First, it is necessary to find the SELECTs that are, or will be, too slow with the
current indexes, at least with the worst input; for example, “the largest customer”
or “the oldest date”. Second, indexes have to be designed to make the slow
SELECTs fast enough without making other SQL calls noticeably slower. Neither
of these tasks is trivial.
Search WWH ::




Custom Search