Database Reference
In-Depth Information
if not all multidimensional elements can be fed with data from the source
systems, a new iteration with the users to modify their requirements
according to the availability of data is required. As a result, a new schema
should be developed and presented to the users for acceptance. The changes
to the schema may require modification of existing mappings.
10.4.2 Analysis-Driven Conceptual Design for the
Northwind Case Study
Develop Initial Schema
Based on the users' requirements, we developed the initial conceptual diagram
shown in Fig. 10.7 . This diagram was presented in Chap. 4 ,andwerepeatit
here to facilitate the presentation.
As described in the requirements phase, the main focus of analysis pertains
to sales amount figures. This is represented at the conceptual level by the
Sales fact in Fig. 10.7 . Given that the source data are organized into orders,
we need to transform order data into sales facts during the ETL process. For
example, the schema in Fig. 10.7 includes the measures Quantity , UnitPrice ,
Discount , SalesAmount , Freight ,and NetAmount . The first three measures are
obtained directly from the sources, while the others must be computed during
the ETL process. SalesAmount will be computed from Quantity , UnitPrice ,
and Discount . On the other hand, since in the operational database Freight
is associated with a complete order rather than with an order line, in the
data warehouse it must be distributed proportionally across the articles in
the corresponding order. Finally, NetAmount is a derived measure, computed
over the data cube. We remark the difference between a measure that is
computed during the ETL process ( SalesAmount ) and a derived attribute,
which is computed from the data cube ( NetAmount ). In addition, we specify
the aggregate function to be applied to each measure. For example, average
is applied to the measures UnitPrice and Discount . Note that the measures
Freight and NetAmount are not included in Table 10.1 since they do not
follow from Queries 1a-4c, which only represent a portion of the actual set
of queries.
The Sales fact is defined between the Product , Supplier , Customer ,
Employee ,and Time dimensions. Since the orders are associated with
different time instants, the Time level participates in the Sales fact with the
roles OrderDate , DueDate ,and ShippedDate . According to the requirements
summarized in Table 10.1 ,the Time dimension contains four aggregation
levels, where most of the scenarios include aggregation over time. Dimension
Product is related to the parent level Category with a one-to-many cardinality,
defining a strict hierarchy. The level Product also contains specific information
about products.
Search WWH ::




Custom Search