The different approaches to designing a data warehouse should not be
disregarded. Inmon's approach mentioned above is one option. Another option is
Kimball's approach [ 109 ], which designs a data warehouse as a collection of
dimensionally modeled data marts. Yet another approach is to build the data
warehouse from independent data marts. This approach, however, is seen as
inappropriate in the data warehouse community, because ETL steps are repeated
unnecessarily and the independent marts lack cross-department analysis and
communication capabilities [ 114 ]. Kimball's approach achieves results more
quickly and more simply than Inmon's, but a common criticism is that it lacks
enterprise-wide focus. Jukic [ 114 ] describes the different approaches
and outcomes of Inmon's and Kimball's methodologies as a trade-off between
extensiveness and power versus quickness and simplicity. A detailed comparison
of the Inmon and Kimball approaches can be found in [ 21 ].
Another structure in the analytical landscape that is complementary to the data
warehouse according to Inmon [ 108 , Chap. 16] is the operational data store (ODS).
It is a subject-oriented, integrated, volatile, current-valued, detailed-only collection
of data to support a company's need for reporting on up-to-the-second, integrated,
operational data [ 100 ]. Subject-orientation and integration are similarities between
the data warehouse and the ODS. Differences lie in the freshness of data, the amount
of data that is kept in the ODS, and that data in the ODS is updated by overwriting
existing entries instead of adding another snapshot. Due to the updates, the ODS
contains no historical data. In the ODS only detailed data is kept, whereas a data
warehouse contains detailed and summary data. ODSs are categorized into different
classes according to the length of their update intervals affecting the freshness of
data they contain.
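The difference in update behavior described above can be illustrated with a minimal sketch. It is not taken from the cited sources; the class names OperationalDataStore and DataWarehouse, the CustomerBalance record, and the demo function are hypothetical and chosen only for illustration. The sketch assumes a single subject (a customer balance) and shows that loading into the ODS overwrites the current value, whereas loading into the data warehouse appends a dated snapshot.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CustomerBalance:
    """Hypothetical detailed record for one subject (a customer)."""
    customer_id: int
    balance: float


class OperationalDataStore:
    """Volatile, current-valued store: an update overwrites the existing entry."""

    def __init__(self):
        self._rows = {}

    def load(self, record):
        # Overwrite in place, so no history is retained.
        self._rows[record.customer_id] = record

    def current(self, customer_id):
        return self._rows[customer_id]

    def row_count(self):
        return len(self._rows)


class DataWarehouse:
    """Snapshot store: every load adds another dated snapshot."""

    def __init__(self):
        self._snapshots = []

    def load(self, record, as_of):
        # Append instead of overwrite, so earlier values stay queryable.
        self._snapshots.append((as_of, record))

    def history(self, customer_id):
        return [(as_of, rec.balance)
                for as_of, rec in self._snapshots
                if rec.customer_id == customer_id]


def demo():
    """Load the same two updates into both stores and return them."""
    ods = OperationalDataStore()
    dw = DataWarehouse()
    updates = [(date(2024, 1, 1), 100.0), (date(2024, 1, 2), 80.0)]
    for as_of, balance in updates:
        record = CustomerBalance(customer_id=1, balance=balance)
        ods.load(record)
        dw.load(record, as_of)
    return ods, dw
```

After demo() runs, the ODS holds a single row with the latest balance, while the data warehouse holds both dated snapshots. Summary data, which the text attributes only to the data warehouse, is left out of the sketch for brevity.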
Figure 2.10 gives an example of how the different OLAP data stores can be
associated and shows possible flows of data between them. Sources of data can be
enterprise resource planning systems, files like spreadsheets, or external services,
to name just a few. The ETL process between the data sources and the data warehouse
is not explicitly shown in this overview. A detailed comparison of data warehousing
methodologies including all three of the mentioned analytical structures is given
in [ 184 ].
Inmon [ 100 ] initially defined three classes of ODSs. Class I ODSs are kept
synchronized with the operational systems they retrieve data from, so that data is
available for reporting only seconds after it is inserted in the operational systems.
Class II ODSs are updated periodically every hour or in similar time intervals.
Updates to the operational systems are stored in an intermediate file, which is then
loaded into the ODS. Class III ODSs are subject to the same process, but the time
interval between data updates in the ODS can be 24 h or more. The business case
for class II and III ODSs is the most common [ 100 ]. Operational costs for class
I ODSs are much higher because of the immediate synchronization, and business
cases justifying this class of ODS are rare [ 100 ]. A fourth ODS type (class IV)
was introduced later on. This class of ODS holds results of reports from the data
warehouse [ 105 ]. The rationale for the creation of this ODS class was that the data
warehouse did not provide responses in real time. The class IV ODS is able to do so