Database Reference
In-Depth Information
values
p2
,
p4
,and
p6
. We then access the bitmap vectors with these labels
in Fig.
7.13
f, thus performing a join between
Product
(Fig.
7.13
a) and
Sales
.
Only the vectors labeled
p2
and
p4
match the search since there is no fact
record for product
p6
. The third, fourth, and sixth rows in the fact table are
the answer since they are the only ones with a '1' in the corresponding vectors
in Fig.
7.13
f. We then obtain the key values for the
CustomerKey
(
c2
and
c3
)
using the bitmap index in Fig.
7.13
e. With these values we search in the B
+
-
tree index over the keys in tables
Product
and
Customer
to find the names of
the products and the customer satisfying the query condition. Note that this
performs the join between the dimensions and the fact table. As we can see
in Figs.
7.10
aand
7.13
c, the records correspond to the names
cust2
,
cust3
,
prod2
,and
prod4
, respectively. Finally, the query answer is (
cust2
,
prod2
,
200
)
and (
cust3
,
prod4
,
100
).
Note that the last join with
Customer
would not be needed if the query
would have been of the following form:
SELECT S.CustomerKey, P.ProductKey, SUM(SalesAmount)
FROM Sales S, Product P
WHERE S.ProductKey = P.ProductKey AND P.Discontinued =
'
Yes
'
GROUP BY S.CustomerKey, P.ProductKey
The query above only mentions attributes in the fact table
Sales
.Thus,the
only join that needs to be performed is the one between
Product
and
Sales
.
We illustrate now the evaluation of star queries using bitmap join indexes.
We have seen that the main idea is to create a bitmap index over a fact
table using an attribute belonging to a dimension table, precomputing the
join between both tables and building a bitmap index over the latter.
Figure
7.13
g shows the bitmap join index between
Sales
and
Product
over the attribute
Discontinued
. Finding the facts corresponding to sales
of discontinued products, as required by the query under study, is now
straightforward: we just need to find the vector labeled '
Yes
', and look for the
bits set to '1'. During query evaluation, this avoids the first step described in
the previous section, when evaluating the query with bitmap indexes. This
is done at the expense of the cost of (off-line) precomputation.
Note that this strategy can reduce dramatically the evaluation cost if in
the
SELECT
clause there are no dimension attributes, and thus, we do not
need to join back with the dimensions using the B
+
-tree as explained above.
Thus, the answer for the alternative query above would just require a simple
scan of the
Sales
table, in the worst case.
7.7 Data Warehouse Partitioning
In a database, partitioning or fragmentation divides a table into smaller
data sets (each one called a partition) to better support the management
and processing of very large volumes of data. Partitioning can be applied
to tables as well as to indexes. Further, a partitioned index can be
Search WWH ::
Custom Search