Databases Reference
In-Depth Information
business process of profit analysis. There are typically many business processes to con-
sider when designing a database. Obviously, it is not possible to materialize all the views,
so we must pick a strategic subset of the views for materialization.
Early research on the problem of selecting materialized views utilized a constraint
on the number of views materialized [Harinarayan, Rajaraman, and Ullman 1996].
Today, it is still useful to constrain the number of materialized views for at least two rea-
sons: the time available for designing the database is a constraint and materializing more
views requires more design time. Also, materializing more views requires more update
processes. Not only does increasing the number of materialized views consume more
computer resources, it also requires more database administration. As the base tables
evolve with the company's needs, the materialized views and the update processes must
be adjusted. Set a limit on the number of views you are willing to design and maintain.
Focusing on views that answer frequent queries is a heuristic that reduces the com-
plexity of the design process to a more manageable level. Constructing a complete prod-
uct graph is a problem with exponential complexity relative to the number of dimen-
sions. However, a simplified lattice is easily constructed based on the frequent queries.
The lattice in Figure 5.8 illustrates a simple lattice structure. The pertinent base tables
are included at the top. The Profit_Fact view can be created from the base tables. The
other three views can be calculated from the Profit_Fact view.
Basically, the lattice in Figure 5.8 is a reduced version of the product graph in Fig-
ure 5.7. The views that are queried infrequently are not included, leading to a much
simpler diagram. We've added the base tables and the view schemas into the lattice, suc-
cinctly capturing the situation. Each node is tagged with the size in blocks, and the asso-
ciated query frequency when appropriate. This metadata is important when calculating
the relative benefits of various possible view materializations.
Notice the 4-byte invoice_date_id replaces the 8-byte invoice_date column used in
Figure 5.5. The number of blocks is recalculated accordingly. The Profit_Fact table cal-
culations are:
Average rows per block (buffer)
= floor((65,536 bytes/buffer)/(36 bytes/row)) = 1,820.
Number of buffers = ceiling(2,000,000 rows/(1,820 rows/buffer))
= 1,099.
The amount of disk space required to store the materialized views is an obvious
constraint. Kimball et al. [1998, p. 544] mention that as a rule of thumb in the context
of data warehousing, many database administrators (DBAs) limit the space increase
needed for aggregated materialized views to a factor of two or less. This may vary
depending on the purpose of the database, and the resources of the organization. If you
are hoping to improve performance without implementing a full-fledged data ware-
house, then you can set your disk space limit at a more modest level. If the views that
Search WWH ::




Custom Search