Databases Reference
In-Depth Information
The Load function will treat the Update row as an update to a previously
existing row. The Load function will treat the Insert row as a new row to be
inserted. The net effect will be a row for the next time frame that indicates
that the entity attributes no longer exist. Even that row, the row that shows
the nonexistence of the Entity Key, will have its own Instance Key.
cascading Instance keys
Chapter 11 explained the concept of Cascading Instance Keys. Dimension
tables relate to other dimension tables via a foreign key/primary key rela-
tion. A foreign key that is embedded in a first dimension table will directly
join to the primary key of a s econd dimension table. FigureĀ  11.4 and
FigureĀ 11.5 display this design.
Typically the dimension tables that do not have a foreign key to another
dimension table are those dimension tables that are at the top of their hier-
archies and those dimension tables that are lookup tables. Lookup tables
provide a textual description of a cryptic code or indicator value. The vast
majority of dimension tables in a data warehouse will reference another
dimension table. The Entity Key and Instance Key of a l ookup table,
which is referenced by a dimension table, will be embedded in the dimen-
sion table. The Entity Key and Instance Key of a hierarchical dimension
table, which is referenced by a lower hierarchical dimension table, will be
embedded in the lower hierarchical table. This is true for both Type 1 and
Type 2 time variant dimension tables.
For the purposes of dimension CDC this foreign key/primary key
aspect of this Time Variant Solution Design is very important. Cascading
Instance Keys dictate the sequence of the ETL jobs for dimension tables.
A dimension table can reference the Instance Key of a second dimension
table only after the Instance Key for the second dimension table has been
generated. hat means the ETL jobs cannot be run at random. Instead,
the ETL job for a dimension table can only run after the ETL jobs for all
tables referenced by that first table have run, including the generation of
Instance Keys. For example, if Table A references Table B, and Table B
references Table C, then the sequence of ETL jobs would populate Table C,
then Table B, and finally Table A. In the case of a lookup table, the ETL job
for a lookup table must complete before the ETL jobs for dimension tables
that reference that lookup table can run.
That is why Cascading Instance Keys dictate the sequence of ETL jobs.
An ETL job flow will typically begin with lookup tables and top-level
Search WWH ::




Custom Search