Database Reference
In-Depth Information
they come in the vector. Let us analyze the following sequence of bits:
0000000111000000000011. We have two runs of lengths 7 and 10, respectively,
three '1's in between, and two '1's at the end. This vector can be trivially
represented as the sequence of integers 7,1,1,1,10,1,1. However, this encoding
can be ambiguous, since we may not be able to distinguish if a '1' is an actual
bit or the length of a run. Let us see how we can handle this problem. Let us
call j the number of bits needed to represent n , the length of a run. We can
represent the run as a sequence of j −
1 '1' bits, followed by a '0', followed by
n in binary format. In our example, the first run, 0000000, will be encoded as
110111, where the first two '1's correspond to the j −
1 part, '0' indicates the
component of the run, and the last three '1's are the number 7 (the length
of the run) in binary format.
Finally, the bitmap vector above is encoded as 1100111 111 11101010 1,
where the encoding is indicated in boldface and the actual bits of the vector
are indicated in normal font. Note that since we know the length of the array,
we could get rid of the trailing '1's to save even more space.
7.5.3 Join Indexes
It is a well-known fact that join is one of the most expensive database
operations. Join indexes are particularly ecient for join processing in
decision-support queries since they take advantage of the star schema design,
where, as we have seen, the fact table is related to the dimension tables by
foreign keys, and joins are typically performed on these foreign keys.
The main idea of join indexes consists in precomputing the join as shown
in Fig. 7.12 . Consider the dimension table Product and the fact table Sales
from the Northwind data warehouse. We can expect that many queries
require a join between both tables using the foreign key. Figure 7.12 adepicts
table Product , with an additional attribute RowIDProd ,andFig. 7.12 bshows
table Sales extended with an additional attribute RowIDSales . Figure 7.12 c
shows the corresponding join index, basically a table containing pointers to
the matching rows. This structure can be used to eciently answer queries
requiring a join between tables Product and Sales .
A particular case of join index is the bitmap join index . Suppose now that a
usual query asks for total sales of discontinued products. In this case, a bitmap
join index can be created on table Sales over the attribute Discontinued ,as
shown in Fig. 7.12 d. As can be seen, the sales pertaining to discontinued
products (products p2 and p4 )havea' 1 ' in the bit vector labeled ' Yes '.
At first sight, this may appear to be strange because attribute Discontinued
does not belong to Sales . Actually what happens is that the index points to
the tuples in Sales that store sales of discontinued products. This is done by
precomputing the join between both tables through the attribute ProductKey
and then creating a bitmap index on Sales for each possible value of the
Search WWH ::




Custom Search