Database Reference
In-Depth Information
Integrated For proper decision making, you need to pull all the relevant data from
the various applications. The data in the data warehouse come from several opera-
tional systems. Source data are in different databases, files, and data segments. These
are disparate applications. The operational platforms could be different. The
operating systems could be different. File layouts, character code representations,
field naming conventions—all of these could be different.
In addition to data from internal operational systems, for many enterprises, data
from outside sources are likely to be very important. Companies such as Metro Mail,
A. C. Nielsen, and IRI specialize in providing vital data on a regular basis. Your
data warehouse may need data from such sources. This is one more variation in the
mix of source data for a data warehouse.
Before the data from various disparate sources can be usefully stored in a data
warehouse, you have to remove the inconsistencies. You have to standardize the
various data elements; you have to make sure of the meanings of data names in
each source application. Before moving the data into the data warehouse, you have
to go through a process of transformation, consolidation, and integration of the
source data.
Time-Variant For an operational system, the stored data contain the current values.
In an accounts receivable system, the balance is the current outstanding balance in
the customer's account. In an order entry system, the status of an order is the current
status of the order. In a consumer loans application, the balance amount owed by
the customer is the current amount. Of course, we store some past transactions in
operational systems. But, essentially, operational systems reflect current information
because these systems support day-to-day current operations.
On the other hand, the data in the data warehouse are meant for analysis and
decision making. If a user is looking at the buying pattern of a specific customer,
the user needs data not only about the current purchase but on the past purchases
as well. When a user wants to find out the reason for the drop in sales in the North
East division, the user needs all the sales data for that division over a period extend-
ing back in time. When an analyst in a grocery chain wants to promote two or more
products together, that analyst wants sales of the selected products over a number
of past quarters.
A data warehouse, because of the very nature of its purpose, has to contain his-
torical data, not just current values. Data are stored as snapshots over past and
current periods. Every data structure in the data warehouse contains the time
element. You will find historical snapshots of the operational data in the data ware-
house. This aspect of the data warehouse is quite significant for both the design and
implementation phases.
For example, in a data warehouse containing units of sale, the quantity stored in
each file record or a table row relates to a specific time element. Depending on the
level of the details in the data warehouse, the sales quantity in a record may relate
to a specific date, a specific week, a specific month, or a specific quarter.
The time-variant nature of the data in a data warehouse allows for analysis of
the past, relates information to the present, and enables forecasts for the future.
Nonvolatile Data extracted from the various operational systems and pertinent
data obtained from outside sources are transformed, integrated, and stored in the
Search WWH ::




Custom Search