Database Reference
In-Depth Information
When you need to analyze data from a database, however, an RDBMS and OLTP often suffer from severe
performance problems. For example, finding total sales by site and by month requires the joining of all the
rows in many tables; such processing takes a considerable number of database accesses and considerable
time to accomplish. Consequently, many organizations continue to use RDBMSs and OLTP for their normal
day-to-day processing or for operational purposes, but the organizations have turned to data warehouses for
the analysis of their data. The following definition for a data warehouse is credited to W. H. Inmon (Inmon,
W. H. Building the Data Warehouse. QED, 1990), who originally coined the phrase.
Definition:
A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in
support of management
s decision-making process.
Subject-oriented means that data is organized by entity rather than by the application that uses the data.
For example, Figure 9-12 shows the databases for typical operational applications such as inventory, order
entry, production, and accounts payable. When the data from these operational databases is loaded into a
data warehouse, it is transformed into subjects such as product, customer, vendor, and financial. Data about
products appears once in the warehouse even though it might appear in many files and databases in the
operational environment.
'
296
Operational applications
Inventory
database
Order entry
database
Data warehouse
Product
Vendor
Customer
Financial
Production
database
Accounts payable
database
FIGURE 9-12
Data warehouse architecture
NOTE
For the operational applications shown in Figure 9-12, large organizations use a variety of DBMSs and file-processing systems
that have been developed over a period of many years.
Integrated means that data is stored in one place in the data warehouse even though the data originates
from everywhere in the organization and from a variety of external sources. The data can come from recently
developed applications or from legacy systems developed many years ago.
Time-variant means that data in a data warehouse represents snapshots of data at various points in time
in the past, such as at the end of each month. This is unlike an operational application, which has data that
is accurate as of the moment. Data warehouses also retain historical data for long periods of time; that data is
summarized to specific time periods, such as daily, weekly, monthly, and annually.
Nonvolatile means that data is read-only. Data is loaded into a data warehouse periodically, but users
cannot update a data warehouse directly.
Search WWH ::




Custom Search