Time Variant Solution Definition - Data Warehouse Designs

Databases Reference

In-Depth Information

only means that the Time Variant Solution Design is the singular focus

of this chapter.

oNe roW At A tIme

The question is so simple, the answer so obvious, that it seems a t rick

question. When a dimension table has five million rows, and you want to

join to one and only one of those five million rows, what is the fastest and

most eἀ cient way to retrieve that one and only one row? The answer is

simple. Give each of those five million rows a unique identifier. Each row

has one and only one unique identifier. Each unique identifier has one

and only one row. It is a one-to-one relation: one row equals one unique

identifier, and one unique identifier equals one row. A unique identifier

can never be associated with any other row, and a row can never be asso-

ciated with any other unique identifier. If a row ceases to exist, the unique

identifier ceases to exist. If a u nique identifier ceases to exist, the row

ceases to exist.

Now, given that construct in a d imension table that has five million

rows, and you want to join to one and only one of those five million rows,

what is the fastest and most eἀ cient way to retrieve that one and only

one row? The answer is simple. Include a WHERE clause statement that

says “WHERE unique_identifier = 1 2345.” Such a W HERE clause will

return one and only one row. No additional logic is required. No superset

is required. The RDBMS does not bring five million rows into memory.

Instead, the RDBMS brings only one row into memory. The goal has been

achieved and we can all congratulate ourselves and call it a day…almost.

We don't query a data warehouse one table at a time. No, we query mul-

tiple tables simultaneously. When a transaction table joins to a dimension

table that has five million rows, and you want to join each individual trans-

action row to one and only one of those five million dimension rows, what

is the fastest and most eἀ cient way to join the transaction and dimension

tables such that each row of the transaction table joins to one and only one

row of the dimension table? The answer is simple, and the same as before.

Give each of those five million dimension rows a unique identifier. Embed

that unique identifier in each row of the transaction table. Then, when

each row of the transaction table is joined to the dimension table, each

individual row of the transaction table will join with one and only one

Search WWH ::

Custom Search

Home