Introduction to Physical Database Design - Physical Database Design

Databases Reference

In-Depth Information

This kind of index is extremely efficient for queries involving both a part number

(pnum) and shipping date (shipdate). For queries involving just one of these attributes,

it is less efficient because of its greater size and therefore longer search time.

When we want to improve the query time for a table of data, say for instance the

table we access via the nonunique index on ship dates, we could organize the database so

that equivalent ship dates are stored near each other (on disk), and ship dates that are

close to each other in value are stored near each other. This type index is called a clus-

tered index . Otherwise the index is known as a nonclustered index . There can only be one

clustered index per table because the physical organization of the table must be fixed.

When the physical database table is unordered, it can be organized for efficient

access using a hash table index often simply known as a hash index . This type of index is

most frequently based on a key that has unique values in the data records. The attribute

(key) value is passed through a function that maps it to a starting block address, known

as a bucket address . The table must be set up by inserting all the records according to the

hash function, and then using the same hash function to query the records later.

Another variation of indexing, a bitmap index , is commonly used for secondary

indexing with multiple attribute values, and for very large databases in data warehouses.

A bitmap index consists of a collection of bit vectors, with each bit vector corresponding

to a particular attribute value, and for each record in the table, the bit vector is a “1” if

that record has the designated bit vector value, and “0” if it does not. This is particularly

useful if an attribute is sparse, that is, it has very few possible values, like gender or

course grade. It would not work well for attributes like last name, job title, age, and so

on. Bit vectors can be stored and accessed very efficiently, especially if they are small

enough to be located in memory.

The analysis and design of indexes are discussed in detail in Chapters 2 and 4.

1.3.2 Materialized Views

When one or more tables are queried, the result can be stored in what is called a materi-

alized view . Normally, views in SQL are stored as definitions or templates, but material-

ized views are stored as tables in the database just like any other table. In data ware-

houses, materialized views are maintained as aggregates of data in the base tables. These

kinds of views are very useful to speed up queries of data that have been asked before

(and frequently), or queries based on aggregates of data that can build on materialized

views to answer the question instead of having to go back to the original data each time.

Potentially a great deal of query time savings can be realized if the proper set of materi-

alized views is stored. It is usually impossible to store all possible views because of stor-

age space limitations, so some means must be found to focus on the best set of views to

materialize. There is also a problem with updates—when base tables are updated, this

cascades into the materialized views, which are derived from the base tables. The prob-

lem of multiple updates makes the use of materialized views less efficient, and this must

Search WWH ::

Custom Search

Home