Database Reference
In-Depth Information
Formatting Your Code
Often the code you use to isolate the data is later used in the ETL process. Because of this, you may want to take
time to make your code look professional. One way of making your code more professional is to format the ETL
code. For example, although using a star symbol to indicate all columns implicitly is acceptable practice for ad
hoc queries, a better practice is to explicitly list columns individually as we have in Listing 6-5.
Listing 6-5. Explicitly Listing the Columns
Select
[title_id]
, [title]
, [type]
, [pub_id]
, [price]
, [advance]
, [royalty]
, [ytd_sales]
, [notes]
, [pubdate]
From [Pubs].[dbo].[Titles]
You may notice that we are using square brackets around column and table names. This is optional; however,
this convention is also considered a best practice.
One additional convention is identifying a table using its full name, or at least most of it. The standard parts
of a table's name are <ServerName>.<DatabaseName>.<SchemaName>.<TableName> . It is common to use the
last three parts of the fully qualified name, but you seldom use the server name part. Doing so indicates that you
want to access a table on a remote server. Although this can be advantageous, it necessitates that SQL Server be
configured to use linked servers, something that is not commonly done on a production server. Without a linked
server, including the server name as part of the full name of the table generates an error. You therefore use the
three-part name most of the time.
Identifying the Transformation Logic
Identifying which transformation is needed and then programming the transformation logic is the portion of
the ETL process that usually takes the longest. Let's take a look at an example to understand what is involved. In
Figure 6-2 , you see two tables: the Titles OLTP database table and the DimTitles OLAP data warehouse table. Let's
compare the two tables:
The DimTitles table has fewer columns.
The column names are different between the tables.
The data types are different on some columns.
There is now a surrogate key that can be used for foreign key references.
Nullability has been changed in many columns.
Some values have been cleansed and made more readable.
 
Search WWH ::




Custom Search