Database Reference
In-Depth Information
method controls the index size are prominent in many research articles. Since
there is plenty of information on the index sizes, we have chosen to concentrate
on the query response time. Interested readers can obtain more information
on discussions of the index sizes in References 45 and 109.
6.6.3 Bitmap Encoding
Bitmap encoding techniques can be thought of as ways of manipulating the
bitmaps produced by the basic bitmap index to either reduce the number
of bitmaps in an index or reduce the number of bitmaps needed to answer
a query. For example, to answer a range query of the form “ A
3” in the
example given in Figure 6.15, one needs to OR the three bitmaps b 1 , b 2 , and
b 3 . If most of the queries involve only one-sided range conditions as in this
example, then it is possible to store C bitmaps that correspond to A
<
a i for
each of the C distinct values of A . We call C the column cardinality of A . Such
a bitmap index would have the same number of bitmaps as the basic bitmap
index, but can answer all one-sided range queries by reading one bitmap. This
is the range encoding proposed by Chan and Ioannidis. 21 The same authors
also proposed another encoding method called the interval encoding that uses
about half as many bitmaps as the basic bitmap index, but answers any range
queries with only two bitmaps. 22 The encoding used in the basic bitmap index
is commonly referred to as the equality encoding . Altogether, there are three
basic bitmap encoding methods: equality, range, and interval encodings.
The three basic bitmap encodings can be composed together to form two
types of composite encoding schemes: multicomponent encoding 21 , 22 and mul-
tilevel encoding. 86 , 108 The central idea of multicomponent encoding is to break
the key value corresponding to a bitmap into multiple components in the same
way an integer number is broken into multiple digits in a decimal represen-
tation. In general, each “digit” may use a different basis size. For an integer
attribute with values between 0 and 62, we can use two components of basis
sizes 7 and 9, and index each component separately. If the equality encoding
is used for both components, then we use 7 bitmaps for one component and
9 bitmaps for the other. Altogether we use 16 bitmaps, instead of 63 had
the equality encoding been used directly on the key values. It is easy to see
that using more components can reduce the number of bitmaps needed, which
may reduce the index size. To carry this to the extreme, we can make all base
sizes 2 and use the maximum number of components. This particular multi-
component encoding can be optimized to be the binary encoding, 100 which is
also known as the bit-sliced index. 64 , 66 This encoding produces the minimum
number of bitmaps, and the corresponding index size is the smallest without
compression. A number of authors have explored different strategies of using
this type of encoding. 20 , 112
To answer a query using a multicomponent index, all components are typ-
ically needed, therefore, the average query response time may increase with
the number of components. It is unclear how many components would offer
the best performance. A theoretical analysis concluded that two components
Search WWH ::




Custom Search