Databases Reference
In-Depth Information
This kind of index is extremely efficient for queries involving both a part number
(pnum) and shipping date (shipdate). For queries involving just one of these attributes,
it is less efficient because of its greater size and therefore longer search time.
When we want to improve the query time for a table of data, say for instance the
table we access via the nonunique index on ship dates, we could organize the database so
that equivalent ship dates are stored near each other (on disk), and ship dates that are
close to each other in value are stored near each other. This type index is called a clus-
tered index . Otherwise the index is known as a nonclustered index . There can only be one
clustered index per table because the physical organization of the table must be fixed.
When the physical database table is unordered, it can be organized for efficient
access using a hash table index often simply known as a hash index . This type of index is
most frequently based on a key that has unique values in the data records. The attribute
(key) value is passed through a function that maps it to a starting block address, known
as a bucket address . The table must be set up by inserting all the records according to the
hash function, and then using the same hash function to query the records later.
Another variation of indexing, a bitmap index , is commonly used for secondary
indexing with multiple attribute values, and for very large databases in data warehouses.
A bitmap index consists of a collection of bit vectors, with each bit vector corresponding
to a particular attribute value, and for each record in the table, the bit vector is a “1” if
that record has the designated bit vector value, and “0” if it does not. This is particularly
useful if an attribute is sparse, that is, it has very few possible values, like gender or
course grade. It would not work well for attributes like last name, job title, age, and so
on. Bit vectors can be stored and accessed very efficiently, especially if they are small
enough to be located in memory.
The analysis and design of indexes are discussed in detail in Chapters 2 and 4.
1.3.2 Materialized Views
When one or more tables are queried, the result can be stored in what is called a materi-
alized view . Normally, views in SQL are stored as definitions or templates, but material-
ized views are stored as tables in the database just like any other table. In data ware-
houses, materialized views are maintained as aggregates of data in the base tables. These
kinds of views are very useful to speed up queries of data that have been asked before
(and frequently), or queries based on aggregates of data that can build on materialized
views to answer the question instead of having to go back to the original data each time.
Potentially a great deal of query time savings can be realized if the proper set of materi-
alized views is stored. It is usually impossible to store all possible views because of stor-
age space limitations, so some means must be found to focus on the best set of views to
materialize. There is also a problem with updates—when base tables are updated, this
cascades into the materialized views, which are derived from the base tables. The prob-
lem of multiple updates makes the use of materialized views less efficient, and this must
Search WWH ::




Custom Search