Several additional extensions to the basic problem formulation are discussed
in the references found at the end of the chapter.
9.2.3 Relationship with Materialized View Selection
The ideas discussed in this section have similarities with the work on
materialized view selection that we discussed in Chapter 8. In fact, each
materialized subcube can be seen as a materialized view, and using a subcube to
answer a query corresponds to the classical problem of answering queries
using materialized views. However, the problem described in this section relies
on additional restrictions compared with the generic problem of recommending
views, and thus we are able to provide a greedy algorithm with quality
guarantees. Specifically:
• Workload queries are templatized in a very specific way. There always
  are 2^n possible cubes to consider.
• The workload can be seen as composed of queries that exactly match
  one of the possible subcubes (modulo additional equality constraints).
• The cost model assumes that no indexes are available and that the cost
  of doing aggregation linearly depends on the size of the used subcube.
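
As a rough illustration of these restrictions, the following Python sketch enumerates the 2^n candidate subcubes for a set of dimensions and picks the cheapest materialized subcube that can answer a query. It assumes that a subcube can answer a query when its dimensions include all of the query's grouping dimensions, and that the cost equals the subcube's row count. The dimension names, the sizes, and the function names all_subcubes and cheapest_subcube are hypothetical and chosen only for illustration; equality constraints are ignored.

```python
from itertools import combinations


def all_subcubes(dimensions):
    """Enumerate every subset of the dimensions: 2^n candidate subcubes."""
    dims = list(dimensions)
    return [
        frozenset(combo)
        for r in range(len(dims) + 1)
        for combo in combinations(dims, r)
    ]


def cheapest_subcube(query_dims, materialized):
    """Return (subcube, cost) for the smallest usable materialized subcube.

    A subcube is usable if it groups by at least the query's dimensions.
    Cost is assumed to be linear in subcube size (no indexes), so the
    cost is taken to be the row count itself. Returns None if no
    materialized subcube can answer the query.
    """
    query = frozenset(query_dims)
    usable = {cube: size for cube, size in materialized.items() if query <= cube}
    if not usable:
        return None
    best = min(usable, key=usable.get)
    return best, usable[best]


# Hypothetical materialized subcubes mapped to their sizes in rows.
materialized = {
    frozenset({"part", "supplier", "customer"}): 6_000_000,
    frozenset({"part", "supplier"}): 800_000,
}

if __name__ == "__main__":
    print(len(all_subcubes(["part", "supplier", "customer"])))
    print(cheapest_subcube(["part"], materialized))
    print(cheapest_subcube(["customer"], materialized))
```

In this sketch a query grouping by part alone is served from the smaller (part, supplier) subcube, while a query grouping by customer has to fall back to the full cube.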
9.3 Multidimensional Clustering
Multidimensional clustering (MDC) is primarily motivated by the appearance
of large repositories of relational data coupled with the need for complex data
mining and business analytic processing. These workloads are characterized
by multidimensional analysis of compiled enterprise data and typically include
queries with group-by clauses, aggregation, and multidimensional range
selections. Multidimensional clustering generalizes the idea of a clustered index
with the ability to physically cluster a table on multiple dimensions at the
same time. This is a powerful technique that offers significant performance
benefits in several online analytical processing (OLAP) and decision support
systems.
9.3.1 MDC Definition
An MDC table is created by specifying one or more columns as dimensions to
cluster the table rows. Every unique combination of dimension values forms
a logical cell, which is physically organized as a block of consecutive pages
on disk. The set of blocks that contain pages with data having a certain
value on a dimension column is called a slice. By definition, every page of