Database Reference
In-Depth Information
Data Warehouses and Data Marts
According to Figure 12-2, some BI systems read and process operational data directly from the
operational database. Although this is possible for simple reporting systems and small data-
bases, such direct reading of operational data is not feasible for more complex applications or
larger databases. Those larger applications usually process a separate database constructed
from an extract of the operational database.
Operational data are difficult to read for several reasons. For one, querying data for BI
applications can place a substantial burden on the DBMS and unacceptably slow the perfor-
mance of operational applications. Additionally, operational data have problems that limit
their use for BI applications. Further, the creation and maintenance of BI systems require
programs, facilities, and expertise that are not normally available from operations. Because of
these problems, many organizations have chosen to develop data warehouses and data marts
to support BI applications.
Components of a Data Warehouse
To overcome the problems just described, many organizations have created data ware-
houses , which are database systems that have data, programs, and personnel that specialize
in the preparation of data for BI processing. Data warehouse databases differ from operational
databases because the data warehouse data are frequently denormalized. Data warehouses
vary in scale and scope. They can be as simple as a sole employee processing a data extract on
a part-time basis or as complex as a department with dozens of employees maintaining librar-
ies of data and programs.
Figure 12-4 shows the components of a data warehouse. Data are read from operational
databases by the Extract, Transform, and Load (ETL) system . The ETL system then cleans
and prepares the data for BI processing. This can be a complex process.
First, the data may be problematic, which we will discuss in the next section. Second, data
may need to be changed or transformed for use in a data warehouse. For example, the opera-
tional systems may store data about countries using standard two-letter country codes, such as
US (United States) and CA (Canada). However, applications using the data warehouse may need
to use the country names in full. Thus, the data transformation {CountryCode CountryName}
will be needed before the data can be loaded into the data warehouse.
The ETL stores the extracted data in a data warehouse database using a data warehouse
DBMS, which can be different from the organization's operational DBMS. For example, an
organization might use Oracle Database for its operational processing but use SQL Server for
its data warehouse. Other organizations might use SQL Server for operational processing and
Figure 12-4
Components of a Data
Warehouse
Data
Warehouse
Metadata
Data
Warehouse
Database
Operational
Databases
ETL System
Business
Intelligence
Tools
Data Extraction/
Cleaning/
Preparation
Programs
Other
Internal
Data
Data Warehouse
DBMS
External
Data
BI Users
 
 
Search WWH ::




Custom Search