Databases Reference
In-Depth Information
Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data
(Wiley, 2004) from Ralph Kimball and The Kimball Group. A source for white papers about BI, manag-
ing BI projects, and more is TDWI.
It's important to note that data modeling in Excel and PowerPivot were developed by the SQL
Server product team in large part to confront the reality that Excel is being used by business users
to answer business questions that are to slow coming from traditional corporate BI. Data warehouse
and standard SSRS reports move too slowly and do not answer questions fast enough. The traditional
corporate BI approach proves too expensive and too rigid for power users trying to answer questions
quickly. Work must continue regardless of whether the corporate data warehouse has all of the infor-
mation. Getting to trusted data faster is the key to success.
The data warehouse
An ideal data source for loading data into a tabular model is a data warehouse and its data marts. In
many cases a data warehouse already exists. This will depend on several factors such as the size of
your company or the maturity level for business intelligence (discussed in Chapter 2). This optional
intermediary step of creating a data warehouse is not shown in the end-to-end diagram in Figure 3-6.
Notwithstanding, comprehending the process of preparing data in the form of a data warehouse is
worth knowing; otherwise, skip to the next step.
Preparing data in the form of a data warehouse can occur in two phases. First you must design and
create a data warehouse in SQL Server. (A number of topics have been written on best practices for
creating a data warehouse.)
A staging database is created. A staging database is an intermediate storage location used for
organizing, cleansing, and transforming data to prepare it for a data warehouse. A staging database
helps minimize the impact on the source systems during the ETL process. Basically, you want to get
in and out of the data warehouse quickly. If you put everything into a staging database and then if
a transformation step fails, you can restart with the data in stages instead of having to go back and
touch (impact) the source systems again. The staging database sits between the various sources and
the data warehouse.
After creating a data mart or data warehouse, you can use stored procedures or SSIS to create a
repeatable ETL process for getting various sources of data into your data warehouse databases. SSIS is
a rich tool that performs useful operations such as making the data conform to specifications so that
you can use it in applications, to create cubes, or to connect directly to it from reports.
Let's look at some of the components of a data warehouse.
Search WWH ::




Custom Search