Database Reference
In-Depth Information
bitmap index
RID
A
=0
=1
=2
=3
1
0
1000
2
1
0100
3
2
0010
4
2
0010
5
3
0001
6
3
0001
7
1
0100
8
3
0001
b 1
b 2
b 3
b 4
Figure 6.15 An illustration of the basic bitmap index for a column A that
can only take on four distinct values from 0 to 3. Note: RID
=
row identifiers.
are generally ecient for answering queries. In fact, certain compressed
bitmap indexes are known to have the theoretically optimal computational
complexity. 103 They are relatively compact compared with common imple-
mentations of B-Trees, and they scale well for high-dimensional data and
multidimensional queries. Because they do not require the data records to be
in any particular order, they can easily take on data with any organization to
improve the overall data processing task beyond the querying step.
In this section, we explain the key concept of bitmap index and review three
categories of techniques for improving bitmap indexes: compression, encoding,
and binning. We end this section with two case studies on using a particular
implementation of bitmap indexes called FastBit . *
6.6.1 The Basic Bitmap Index
Figure 6.15 shows a logical view of the basic bitmap index. Conceptually,
this index contains the same information as a B-tree 24 , 65 . The key difference
is that a B-tree would store a list of row identifiers (RIDs) for each distinct
value of column A , whereas a bitmap index represents the same information
as sequences of bits, which we call bitmaps . In this basic bitmap index, each
bitmap corresponds to a particular value of the column. A bit with value 1
indicates that a particular row has the value represented by the bitmap. What
is required here is a durable mapping from RIDs to positions in the bitmaps. 63
The mapping used by the first commercial implementation of bitmap
index 65 is as follows. Let m denote the maximum number of rows that can fit
on a page; assign m bits for each page in all bitmaps. The first record in a page
is represented by the first bit assigned for the page, the second record by the
second bit, and so on. If a page contains less than m records, then the unused
bitmaps in the bitmap are left as 0. An additional existence bitmap may be
* FastBit software is available from https://codeforge.lbl.gov/projects/fastbit/ .
Search WWH ::




Custom Search