Selecting Materialized Views - Physical Database Design

Databases Reference

In-Depth Information

history. For example, if the Profit_Fact table becomes too large, data older than two

years could be purged, while the Profit_by_Invoice_Date table can hold five years'

worth of data and still be very small. The purging of detailed data while holding on to

summary data requires an incremental update strategy.

The time available for updates is another constraint to consider. Often, an update

window is set aside during off hours, when the materialized views are updated in a ded-

icated disk. Determine the amount of time available for the update window, if applica-

ble. You may determine that an update window is not possible. Perhaps your organiza-

tion operates 24 hours a day, and the system cannot be made unavailable for any update

window. Perhaps the users need real-time summary data, and can't use data that was

computed last night. Incremental updates can often address these problems.

Instead of computing a view from scratch, incremental updates keep the majority

of the rows of a materialized view intact, and add new results as appropriate. For exam-

ple, consider the Profit_by_Invoice_Date table. Let's say the invoices are processed in a

daily cycle. Information is gathered for the jobs that have shipped since the previous

invoice cycle. Each customer that has jobs in the current cycle has an invoice assigned.

The customer's jobs are placed on the invoice. After the invoices are verified, the

invoices are printed and sent to the customers, and the status of each invoice is updated

in the database. As the cycle completes, the sell value for each job is known. The costs

for each job are also known, since the Job_Costing table is updated in real time and

work completes on each job before it is invoiced. We can calculate the daily totals and

add a single row to the Profit_by_Invoice_Date table. Updating the materialized views

in real time is sometimes referred to as a trickle feed .

5.5 Examples: The Good, the Bad, and the Ugly

Materialized views and star schemas are good approaches for reducing disk I/O, result-

ing in quicker query responses. Figure 5.6 is an example of a star schema, illustrating the

dimensional design approach. Typically, there are more dimensions in a star schema, but

the example is sufficient for discussion. The fact table is a materialized view derived

from the base tables. The materialized view amounts to stored calculations that reduce

the amount of disk I/O. The dimension tables provide a means to group data by various

levels along a variety of dimensions. The dimension tables add flexibility, multiplying

the gains of the fact table over a larger family of queries. The star schema and the

dimensional design approach can be good for improving query responses.

A question naturally arises. If precomputing saves time, then why not precompute

the joins between the fact table and the dimension tables? Wouldn't this save even more

time? The schema shown in Figure 5.9 is a possible consolidation of the fact table with

the dimension tables from Figure 5.6. Now we can query just one table, and we still

have the same ability to group by levels that we had with the dimension tables.

Let's rewrite the second query from Section 5.3. The query is now simpler:

Search WWH ::

Custom Search

Home