one needs to understand the nuances of how to combine the strengths of the technologies to create a
sustainable platform.
A critical success factor in the design approach for the next-generation data warehouse
architecture is a clearly documented and concise set of user requirements. With an appropriate
user specification of the data and the associated processing requirements and outcomes, a program
can be developed toward the implementation of the solution.
The next section focuses on the integration strategy and architecture. There are two primary
portions of the integration architecture: the data integration architecture and the physical
implementation architecture.
Integration strategies
Data integration refers to combining data from different source systems so that business users can
study different behaviors of the business and its customers. In the early days of data integration, the data
was limited to transactional systems and their applications. The limited data set provided the basis for
creating decision support platforms that were used as analytic guides for making business decisions.
The growth in the volume and types of data over the last three decades, along with the advent of
data warehousing and advances in the infrastructure and technologies that support data analysis
and storage, has changed the landscape of data integration forever.
Traditional data integration techniques have focused on ETL (extract, transform, load), ELT
(extract, load, transform), CDC (change data capture), and EAI (enterprise application integration)
types of architecture and their associated programming models. In the world of Big Data, however,
these techniques will need to be modified to suit the size and processing complexity demands,
including the formats of data that need to be processed. Big Data processing needs to be
implemented as a two-step process. The first step is a data-driven architecture that includes
analysis and design of data processing. The second step is the physical architecture
implementation, which is discussed in the following sections.
Data-driven integration
In this technique of building the next-generation data warehouse, all the data within the
enterprise is categorized according to data type. Depending on the nature of the data and its
associated processing requirements, the data is then processed using business rules that are
encapsulated in processing logic and integrated into a series of program flows incorporating
enterprise metadata, MDM, and semantic technologies such as taxonomies.
Figure 10.3 shows the inbound data processing of different categories of data. This model
segments each data type based on the format and structure of the data, and then applies the
appropriate layers of processing rules using ETL, ELT, CDC, or text processing techniques. Let us
analyze the data integration architecture and its benefits.
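The segmentation described above can be pictured as a small routing step: each inbound record carries a category, and the category selects the processing technique. The following is a minimal sketch in Python under stated assumptions, not the architecture from Figure 10.3. The `Record` type, the handler functions, the `unstructured_text` category, and the specific category-to-technique mapping are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Record:
    """One inbound item; all field names are hypothetical."""
    source: str    # name of the source system
    category: str  # data category assigned during classification
    payload: dict  # the raw data


# Placeholder handlers, one per technique named in the text.
# Each returns a label so the routing decision is visible.
def run_etl(record: Record) -> str:
    return f"ETL:{record.source}"


def run_cdc(record: Record) -> str:
    return f"CDC:{record.source}"


def run_text_processing(record: Record) -> str:
    return f"TEXT:{record.source}"


# Assumed mapping from data category to processing technique.
# A real platform would derive this from its own metadata and rules.
ROUTES: Dict[str, Callable[[Record], str]] = {
    "transactional": run_cdc,
    "web_application": run_etl,
    "unstructured_text": run_text_processing,
}


def route(record: Record) -> str:
    """Pick the processing technique based on the record's category."""
    try:
        handler = ROUTES[record.category]
    except KeyError:
        raise ValueError(
            f"no processing rule for category {record.category!r}"
        ) from None
    return handler(record)


def process_batch(records: List[Record]) -> List[str]:
    """Route every record in a batch, preserving input order."""
    return [route(r) for r in records]
```

Keeping the mapping in a single table means a new data category can be supported by adding one entry, without touching the routing logic itself.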
Data classification
As shown in Figure 10.3, there are broad classifications of data:
Transactional data: the classic OLTP data belongs to this segment.
Web application data: data from web applications developed by the organization can be added to
this category. This data includes clickstream data, web commerce data, and customer
relationship and call center chat data.
 