Database Reference
In-Depth Information
When you consider the application of incremental data loads in a data warehouse scen-
ario, there's no better example than the slowly changing dimension (SCD). The nature
of dimensional data is such that it often does require updates by way of manipulating
existing rows in the dimension table (SCD Type 1) or expiring the current record and
adding a new row for that value, thus preserving the history for that dimension attribute
(SCD Type 2).
Although the slowly changing dimension is certainly not the only data warehouse
structure to benefit from an incremental load, it is one of the most common species of
that animal. As such, we'll focus mostly on SCD structures for talking points around
incremental loads.
Incremental Loads of Fact Data
Some data warehouse scenarios require the changing of fact data after it has been
loaded to the data warehouse. This scenario is typically handled through a secondary
load with a negating entry and a delta record, but some designs require the original fact
record to be corrected.
In such cases, the same methodology used for SCD data may also apply to fact
data. However, pay careful consideration to performance when applying SCD methods
to fact data. Fact data is exponentially more voluminous than dimension data and typic-
ally involves millions, and sometimes billions, of records. Therefore, apply the SCD
pattern to fact data only if you must. If there's any flexibility at all in the data ware-
house (DW) design, use the delta record approach instead.
Incremental Loads in SSIS
Microsoft SQL Server, and SSIS specifically, have several tools and methodologies
available for managing incremental data loads. This section discusses design patterns
around the following vehicles:
• Native SSIS components (Lookup transform, conditional split, etc.)
• Slowly Changing Dimension Wizard
MERGE statement in T-SQL
• Change data capture (CDC)
Search WWH ::




Custom Search