2. Cleanse and validate data, and load a central data warehouse.
3. Populate a data mart or dimensional model that provides collections of data from across the enterprise.
Each step of the enterprise data model is implemented by multiple jobs in SAS Data
Integration Studio. Each job in each step can be scheduled to run at the time or event
that best fits your business needs and network performance requirements.
Data Warehousing with SAS Data Integration Studio
Developing an Enterprise Model
SAS Data Integration Studio helps you build dimensional data from across your
enterprise in three steps:
1. Extract source data into a staging area (see “Step 1: Extract and Denormalize Source Data” on page 40).
2. Cleanse the extracted data and populate a central data warehouse (see “Step 2: Cleanse, Validate, and Load Data” on page 40).
3. Create dimensional data that reflects important business needs (see “Step 3: Create Data Marts or Dimensional Data” on page 41).
The three-step enterprise model represents best practices for large enterprises.
Smaller models can be developed from the enterprise model. For example, you can
easily create one job in SAS Data Integration Studio that extracts, transforms, and
loads data for a specific purpose.
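Such a single-purpose job can be pictured as one extract-transform-load pipeline. The sketch below is illustrative Python, not generated SAS code, and every table and column name in it is hypothetical:

```python
# Conceptual sketch of a single extract-transform-load (ETL) job.
# In SAS Data Integration Studio such a job is built from metadata;
# this hand-coded version only illustrates the three phases.

def extract(source_rows):
    """Extract: pull only the columns this job needs from the source."""
    return [{"id": r["id"], "amount": r["amount"]} for r in source_rows]

def transform(rows):
    """Transform: standardize values (here, currency amounts to cents)."""
    return [{"id": r["id"], "amount_cents": round(r["amount"] * 100)}
            for r in rows]

def load(rows, target):
    """Load: append the transformed rows to the target table."""
    target.extend(rows)

source = [{"id": 1, "amount": 19.99}, {"id": 2, "amount": 5.00}]
warehouse = []
load(transform(extract(source)), warehouse)
print(warehouse)
```

In a real deployment each phase would be a scheduled, metadata-driven transformation rather than a Python function, but the data flow is the same.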
Step 1: Extract and Denormalize Source Data
The extraction step consists of a series of SAS Data Integration Studio jobs that
capture data from across your enterprise for storage in a staging area. SAS data access
capabilities in the jobs enable you to extract data without changing your existing
systems.
The extraction jobs denormalize enterprise data for central storage. Normalized data (many tables with many connections between them) is efficient for data collection. Denormalized data (fewer tables with fewer connections) is more efficient for a central data warehouse, where efficiency is needed to populate the data marts.
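To make the distinction concrete, the sketch below joins two hypothetical normalized tables (customers and orders) into one denormalized staging table. The table layouts are invented for illustration and are not taken from any SAS data model:

```python
# Illustrative denormalization: two normalized tables are joined into
# one wide table, so later queries need no join. Repeating the customer
# attributes on every order row trades storage for read efficiency.

customers = {101: {"name": "Acme", "region": "West"}}
orders = [{"order_id": 1, "customer_id": 101, "total": 250.0}]

# One denormalized row per order, with customer attributes copied in.
denormalized = [
    {"order_id": o["order_id"], "total": o["total"],
     **customers[o["customer_id"]]}
    for o in orders
]
print(denormalized)
```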
Step 2: Cleanse, Validate, and Load Data
After loading the staging area, a second set of SAS Data Integration Studio jobs
cleanse the data in the staging area, validate the data prior to loading, and load the
data into the data warehouse.
Data quality jobs remove redundancies, deal with missing data, and standardize
inconsistent data. They transform data as needed so that the data fits the data model.
For more information about available data cleansing capabilities, see the SAS Data Quality Server: Reference.
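The three cleansing tasks named above can be sketched as follows. This is a minimal conceptual example; the record layout, the duplicate rule, and the standardization table are all made up, and real rules would come from the SAS Data Quality Server:

```python
# Illustrative data cleansing: remove redundant records, handle missing
# values, and standardize inconsistent codes. All values are invented.

raw = [
    {"id": 1, "state": "calif."},
    {"id": 1, "state": "calif."},   # redundant record to be removed
    {"id": 2, "state": None},       # missing value to be handled
    {"id": 3, "state": "CA"},
]

# Hypothetical standardization table mapping variants to one code.
STATE_CODES = {"calif.": "CA", "CA": "CA"}

seen, clean = set(), []
for row in raw:
    if row["id"] in seen:
        continue                                 # drop duplicates by id
    seen.add(row["id"])
    state = row["state"] or "UNKNOWN"            # flag missing data
    clean.append({"id": row["id"],
                  "state": STATE_CODES.get(state, state)})

print(clean)
```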
Data validation ensures that the data meets established standards of integrity. Tests
show that the data is fully denormalized and cleansed, and that primary keys, user
keys, and foreign keys are correctly assigned.
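Key-assignment checks of this kind amount to two assertions: primary key values are unique, and every foreign key resolves to a dimension row. The schemas below are hypothetical examples, not SAS structures:

```python
# Illustrative pre-load integrity validation on invented tables:
# a dimension table with a primary key and a fact table that
# references it through a foreign key.

dim_customer = [{"customer_key": 1}, {"customer_key": 2}]
fact_sales = [{"sale_id": 10, "customer_key": 1},
              {"sale_id": 11, "customer_key": 2}]

# Primary key check: no duplicate key values in the dimension.
pk = [r["customer_key"] for r in dim_customer]
assert len(pk) == len(set(pk)), "primary key values must be unique"

# Foreign key check: every fact row points at an existing dimension row.
valid = set(pk)
orphans = [r for r in fact_sales if r["customer_key"] not in valid]
assert not orphans, f"foreign keys without matching dimension rows: {orphans}"

print("validation passed")
```

If either assertion fails, the load is stopped and the offending rows are sent back for cleansing rather than written to the warehouse.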
 