Databases Reference
In-Depth Information
Once the number of rows in the view is known or estimated, the total storage
requirement for the view is usually pretty obvious within a factor of 2 by multiplying
by the average row width, which is easy to estimate by looking at the table definition.
If the count is small, like 32 in our example, it's somewhat irrelevant, but view sizes
can get very large, running into many gigabytes of storage and millions of rows. A
good database design should never spend more than 10 to 20% of total storage on
materialized views.
10.1.3 Counting for Multidimensional Clustering Design
In Chapter 8 we discussed the fact that space waste in a multidimensional clustering
(MDC) table can be estimated by assuming each cell contains a single partially filled
block at the end of its block list. The space waste is then:
10.1
where
η cells is the number of unique cells in the resulting MDC table, P % is the average
percentage of each storage block left empty per cell, and is the block size. We recommend
using 65% as a fairly conservative value for P % . The block size is typically a number of
pages, 32
4 KB pages being a typical block size. However, the space waste is going to be
driven heavily by the number of cells
×
η cells , which grows exponentially with the number
of columns in the MDC if the columns are poorly correlated. As space waste grows,
many of the benefits of clustering can be undone because too much I/O ends up being
done on largely empty pages. A good rule of thumb is to constrain the space waste to less
than 10% of the total non-MDC table size. Most expert MDC designers pick MDC
designs with less than 5% growth.
η cells can be counted using methods similar to the
index counting. For example, to count the number of distinct MDC cells for an MDC
table clustered on COL1, COL2, and COL3, use the following:
SELECT COUNT(*) FROM (SELECT DISTINCT COL1, COL2, COL3
FROM MY_FAV_TABLE) AS NUM_MDC_CELLS;
The size of the table and the block size for the table are usually readily available in
the system catalogs, making it possible to complete the calculation and determine the
MDC growth without going through a very expensive trial-and-error process to actually
materialize the MDC table to evaluate its size.
 
Search WWH ::




Custom Search