Database Reference
In-Depth Information
has a distributing attribute, an additional attribute is added to the
table to store this information.
In the above rules, surrogate keys are generated for each dimension level
in a data warehouse. The main reason for this is to provide independence
from the keys of the underlying source systems because such keys can change
across time. Another advantage of this solution is that surrogate keys are
usually represented as integers in order to increase eciency, whereas keys
from source systems may be representedinlessecientdatatypessuchas
strings. Nevertheless, the keys coming from the source systems should also
be kept in the dimensions to be able to match data from sources with data
in the warehouse. Obviously, an alternative solution is to reuse the keys from
the source systems in the data warehouse.
Notice that a fact table obtained by the mapping rules above will contain
the surrogate key of each level related to the fact with a one-to-many
relationship, one for each role that the level is playing. The key of the table is
composed of the surrogate keys of all the participating levels. Alternatively,
if a surrogate key is added to the fact table, the combination of the surrogate
keys of all the participating levels becomes an alternate key.
As we will see in Sect. 5.5 , more specialized rules are needed for mapping
the various kinds of hierarchies that we studied in Chap. 4 .
Applying the above rules to the Northwind conceptual data cube given
in Fig. 4.2 yields the tables shown in Fig. 5.4 .The Sales table includes eight
foreign keys, that is, one for each level related to the fact with a one-to-
many relationship. Recall from Chap. 4 that in role-playing dimensions ,
a dimension plays several roles. This is the case for the dimension Time
where, in the relational model, each role will be represented by a foreign
key. Thus, OrderDateKey , DueDateKey ,and ShippedDateKey are foreign keys
to the Time dimension table in Fig. 5.4 . Note also that dimension Order is
related to the fact with a one-to-one relationship. Therefore, the attributes
of the dimension are included as part of the fact table. For this reason, such
a dimension is called a fact (or degenerate ) dimension . The fact table
also contains five attributes representing the measures: UnitPrice , Quantity ,
Discount , SalesAmount ,and Freight . Finally, note that the many-to-many
parent-child relationship between Employee and Territory is mapped to the
table Territories , containing two foreign keys.
With respect to keys, in the Northwind data warehouse of Fig. 5.4 ,we
have illustrated the two possibilities for defining the keys of dimension levels,
namely, generating surrogate keys and keeping the database key as data
warehouse key. For example, Customer has a surrogate key CustomerKey and
a database key CustomerID . On the other hand, SupplierKey in Supplier is a
database key. The choice of one among these two solutions is addressed in
the ETL process that we will see in Chap. 8 .
Search WWH ::




Custom Search