Databases Reference
In-Depth Information
Here is the query to obtain profitability information at the monthly level for 2006:
SELECT month, sum(cost), sum(sell), sum(profit)
FROM Calendar AS c, Profit_Fact AS p
WHERE year = 2006
AND c.date_id = p.date_id
GROUP BY month
ORDER BY month;
This is the query to obtain data at the state level by year:
SELECT country, st_prov, year, sum(cost), sum(sell),
sum(profit)
FROM Calendar AS cal, Profit_Fact AS p, Customer AS
cust
AND cal.date_id = p.date_id
AND cust.customer_id = p.customer_id
GROUP BY country, st_prov, year
ORDER BY country, st_prov, year;
There are many combinations of levels possible. We can enumerate the combina-
tions and illustrate the relationships in a product graph. Figure 5.7 is the product graph
for the star schema in Figure 5.6. Actually, the product graph would be three dimen-
sional, because there is also a Job dimension. However, to keep the diagram simple we
are leaving off the level that includes the job_id, except for the Profit_Fact table, which
is important as a data source. The Calendar and the Customer dimensions are orthogo-
nal. The levels along each dimension are arranged in order from the most specific at the
top to the most general at the bottom. Each node represents a view. For example, the
Profit_Fact table corresponds to the node labeled “job, date, customer.” Since the
Profit_Fact table is materialized, we have shaded its node gray.
Following a path from specific to general signifies aggregation. For example,
monthly customer data can be aggregated from the daily customer data. More gener-
ally, the view at any node can be computed by aggregating data from any materialized
ancestor. If aggregation is complete along a dimension, the level is described by the
word “all.” For example, the Profit_by_Customer table corresponds to the node
labeled “all, customer,” and the Profit_by_Invoice_Date table corresponds to the
node “date, all.” Both of these can be materialized from the Profit_Fact table, since it
is a materialized ancestor. The bottom node represents the grand totals of all the data
in the Profit_Fact table. Let's say we've decided to materialize the Profit_by_Customer
and the Profit_by_Invoice_Date views, as indicated in Figure 5.7. If we run a query
for the grand totals, which of the three materialized ancestor views is the best to use as
a data source? The smallest would be the best choice, since the smaller source leads to
Search WWH ::




Custom Search