Database Reference
In-Depth Information
chapter 6
ETL Processing with SQL
The universe is transformation; our life is what our thoughts make it.
—Marcus Aurelius
The ETL process, and projects associated with it, involve extracting vital information from outside sources,
transforming data into clean, consistent, and usable data, and loading it into the data warehouse. This process
is vital to the success of your BI solution. It is what makes the difference between a professional, functional data
warehouse versus one that is messy, insufficient, and unusable. The ETL process is also one of the longest and
most challenging steps in developing a BI solution.
In this chapter, we explain how to perform the ETL tasks required for your BI solution. Our ultimate goal in
both this chapter and the next is to demonstrate a technique where you use a combination of SQL programming
and SQL Server Integration Service (SSIS) to create a professional ETL project that will be a cornerstone of your
BI solution. We cover common SQL programming techniques used to identify issues and provide resolutions
associated with the ETL process. And we show how code for these SQL techniques can be placed into views and
stored procedures to be used for ETL processing.
This chapter is a prelude to Chapter 7 where we delve into how SSIS and the SQL programming techniques
learned in this chapter are combined. Let's begin now by taking a look at the overall process.
Performing the ETL Programming
Figure 6-1 outlines the typical steps involved in creating an ETL process using a combination of SQL Server and
SSIS. Note that the process includes deciding between filling up a table completely with fresh data, loading it
incrementally (as explained in the following section), and updating any changes from the original source.
These steps are followed by locating the data to extract and examining its contents for validity, conformity,
and completeness.
When you have verified that the data available meets your needs, it is likely that you still may need to
manipulate it to some degree to it your destination tables. This manipulation can come in the form of renaming
columns or converting the original data types to their destination data types. Once all of these preparations have
been completed, you can load the data into your data warehouse tables and begin the process again for each data
warehouse table you need to fill.
 
Search WWH ::




Custom Search