different interpretations as it always takes time to collect and process data before it is delivered to business users.
Take the prototype architecture in Figure 5 as an example: a common practice for implementing near-real-time requirements is to add an operational data store (ODS) at the data staging area, so that analysis and reporting applications can access the data as soon as it is delivered from the operational systems and generate reports in a short time. To shorten the time between committing a record change in an operational system and its arrival in the ODS or the data warehouse, the ETL processes in the data staging area can be linked directly to the databases of the operational systems. Techniques such as change data capture (CDC) and message queues can be applied in the ETL processes to provide a fast reflection of record changes in the operational systems.
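As a rough illustration of the CDC-plus-message-queue pattern just described, the sketch below reads change events from a queue and applies them to an ODS table as they arrive. It is a minimal sketch under stated assumptions, not part of the reference architecture: an in-memory queue.Queue stands in for a real broker, sqlite3 stands in for the ODS, and the event format, table name, and apply_change helper are invented for the example.

```python
# Minimal sketch of near-real-time loading: CDC-style change events are read
# from a message queue and upserted into an ODS table one by one.
# queue.Queue stands in for a real broker and sqlite3 for the ODS database;
# the event format and table layout are illustrative assumptions.
import json
import queue
import sqlite3

ods = sqlite3.connect(":memory:")
ods.execute(
    "CREATE TABLE ods_customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)

changes = queue.Queue()
# Two CDC events as an operational system might emit them (an insert, then an update).
changes.put(json.dumps({"op": "I", "id": 1, "name": "Acme Ltd", "ts": "2024-01-01T10:00:00"}))
changes.put(json.dumps({"op": "U", "id": 1, "name": "Acme Limited", "ts": "2024-01-01T10:00:05"}))

def apply_change(event: dict) -> None:
    """Upsert one change record into the ODS as soon as it is received."""
    ods.execute(
        "INSERT INTO ods_customer (customer_id, name, updated_at) VALUES (?, ?, ?) "
        "ON CONFLICT(customer_id) DO UPDATE SET "
        "name = excluded.name, updated_at = excluded.updated_at",
        (event["id"], event["name"], event["ts"]),
    )
    ods.commit()

while not changes.empty():
    apply_change(json.loads(changes.get()))

print(ods.execute("SELECT * FROM ods_customer").fetchall())
# [(1, 'Acme Limited', '2024-01-01T10:00:05')]
```

The point of the pattern is that each committed change in the operational system becomes visible in the ODS after a single small upsert, instead of waiting for the next bulk load window.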
Although different technologies can be applied to enable the near-real-time loading of changed data into data warehouses and ODSs, the data warehouse architecture as a whole faces the following challenges:
•	Compared with the typical approach of loading a large volume of data over a longer period (say, every 12 hours), near-real-time loading normally produces a large number of loading-job executions, each applying only a very small amount of change to the data warehouse. This change in job execution profile requires adequate configuration and scalability planning for the data warehouse platform.
•	In an integrated data warehouse where the integrity of the data model is essential, more frequent data loading puts more pressure on the modeling practices. When an entity in the data model is updated from records of multiple operational systems, a record change from one of those systems may require the other operational systems to deliver data before the change can be applied to the corresponding entity in the data warehouse model; the sketch after this list illustrates the point.
•	The more frequent loading of data with smaller amounts of change also requires that the related metadata be updated at the same frequency. Because the data warehouse is a multi-layered structure, a record change at an operational system naturally causes a ripple effect on all the related metadata at the different layers of the data warehouse architecture, and the quality of that metadata has to be maintained.
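To make the second challenge concrete, the hypothetical sketch below assembles a warehouse entity that is fed by two operational systems and holds a partial change in a buffer until both systems have delivered their share; the system names, attributes, and helper function are assumptions made for the example, not part of the architecture in Figure 5.

```python
# Illustrative sketch: a warehouse entity "dw_customer" is assembled from two
# operational systems (the CRM supplies the name, billing supplies the credit
# limit). A change from one system is held in a pending buffer until the
# other system has also delivered, so the entity is never updated with a
# half-complete picture. All names and the record format are assumptions.
from typing import Dict

REQUIRED_SOURCES = {"crm", "billing"}
pending: Dict[int, Dict[str, dict]] = {}   # customer_id -> {source: partial record}
dw_customer: Dict[int, dict] = {}          # the integrated warehouse entity

def receive(source: str, customer_id: int, attributes: dict) -> None:
    """Buffer a partial change; apply it only once every source has reported."""
    parts = pending.setdefault(customer_id, {})
    parts[source] = attributes
    if REQUIRED_SOURCES.issubset(parts):
        merged = {}
        for partial in parts.values():
            merged.update(partial)
        dw_customer[customer_id] = merged
        del pending[customer_id]

receive("crm", 42, {"name": "Acme Ltd"})
print(dw_customer)   # {} -- still waiting for the billing system to deliver
receive("billing", 42, {"credit_limit": 10000})
print(dw_customer)   # {42: {'name': 'Acme Ltd', 'credit_limit': 10000}}
```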
Master Data Management with Data Warehouse Architecture
Master data management (MDM) is an enterprise data management practice for actively managing and controlling enterprise-wide master data. Compared to data warehouses, MDM covers a broader scope by unifying critical datasets across multiple IT systems. There are very well-understood and easily identified master-data items, such as “customer” and “product” information. Data warehouses are not directly linked to an MDM solution if the solution focuses only on the real-time integration of certain master data among several operational systems. In cases where an MDM solution sits downstream of an enterprise's data flow, data warehouses play a key role as the trusted source of master data.
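As a loose illustration of what serving as a trusted source of master data can involve, the sketch below derives a single “golden” customer record from several operational records using a per-attribute survivorship rule; the source systems, trust ranking, and field names are assumptions made for the example.

```python
# Illustrative sketch: consolidating one "customer" master-data item from
# several systems by preferring, attribute by attribute, the value from the
# most trusted source. The trust ranking and record layouts are assumptions.
records = [
    {"source": "crm",     "customer_id": 42, "name": "Acme Ltd",     "email": None},
    {"source": "erp",     "customer_id": 42, "name": "ACME LIMITED", "email": "ap@acme.example"},
    {"source": "webshop", "customer_id": 42, "name": "acme",         "email": "shop@acme.example"},
]

# Lower number = more trusted for that attribute.
TRUST = {
    "name":  {"crm": 0, "erp": 1, "webshop": 2},
    "email": {"erp": 0, "crm": 1, "webshop": 2},
}

def golden_record(recs: list) -> dict:
    """Pick, per attribute, the non-empty value from the most trusted source."""
    golden = {"customer_id": recs[0]["customer_id"]}
    for attr, ranking in TRUST.items():
        candidates = [r for r in recs if r.get(attr)]
        candidates.sort(key=lambda r: ranking.get(r["source"], len(ranking)))
        golden[attr] = candidates[0][attr] if candidates else None
    return golden

print(golden_record(records))
# {'customer_id': 42, 'name': 'Acme Ltd', 'email': 'ap@acme.example'}
```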
MDM brings various benefits, such as enhancement of data quality, preservation of data integrity and consistency, and reduced time-to-market when implementing business requirements. We list the following trends that arise when MDM is applied over data warehouse architectures.
First, according to the prototype architecture in Figure 5, data quality processes are applied only at the data staging area. As a data warehousing practice, data cleansing and standardization operations must be applied to the data warehouse data flow only once. In the context of MDM, data