Significant issues in the data platforms of this enterprise's current-state architecture prevent the
deployment of solutions on incumbent technologies. The landscape of the current-state architecture
includes:
● Multiple source systems and transactional databases totaling about 10 TB per year in data volumes.
● A large POS network across hundreds of locations.
● Online web transactional databases driving about 7 TB of data per year.
● Catalog and mail data totaling about 3 TB per year in unstructured formats.
● Call center data across all lines of business totaling about 2 TB per year.
● Three data warehouses each containing about 50 TB of data for four years of data.
● Statistical and analytical databases each about 10 TB in summary data for four years of data.
The complexity of this environment also includes metadata databases, MDM systems, and reference
databases that are used in processing the data throughout the system.
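As a rough back-of-the-envelope sketch, the volumes quoted in the list above can be tallied as follows. The figures come from the text; the function names are hypothetical. The sketch assumes that the four yearly feeds do not overlap, which the text does not state, and that 1 TB is 1000 GB. The POS network and the statistical databases are left out because the text gives no volume for the former and no count for the latter.

```python
# Approximate yearly volumes quoted in the text, in TB per year.
# Assumption: these feeds are distinct and do not overlap.
ANNUAL_INFLOW_TB = {
    "source systems and transactional databases": 10,
    "online web transactional databases": 7,
    "catalog and mail data (unstructured)": 3,
    "call center data": 2,
}

WAREHOUSE_COUNT = 3       # "Three data warehouses"
WAREHOUSE_SIZE_TB = 50    # "each containing about 50 TB"


def annual_inflow_tb(feeds=ANNUAL_INFLOW_TB):
    """Sum the yearly volume of all feeds, in TB."""
    return sum(feeds.values())


def warehouse_total_tb(count=WAREHOUSE_COUNT, size_tb=WAREHOUSE_SIZE_TB):
    """Total data held across the data warehouses, in TB."""
    return count * size_tb


def daily_inflow_gb(feeds=ANNUAL_INFLOW_TB):
    """Average daily volume in GB, assuming 1 TB = 1000 GB and 365 days."""
    return annual_inflow_tb(feeds) * 1000 / 365


if __name__ == "__main__":
    print(f"Yearly inflow:    {annual_inflow_tb()} TB")
    print(f"Warehouse total:  {warehouse_total_tb()} TB")
    print(f"Average per day:  {daily_inflow_gb():.1f} GB")
```

Under these assumptions the feeds add up to about 22 TB per year and the three warehouses hold about 150 TB in total. If the 10 TB figure already includes some of the other feeds, the yearly total would be lower.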
The current-state complaint points in processing these volumes of data include:
● Data processing does not complete every day across all the systems:
    ● Too many source-to-data-warehouse extracts.
    ● Too many processes for data transformation.
    ● Too many repetitive business rules.
    ● Too many data-quality exceptions.
    ● Too many redundant copies of data across the data warehouses, datamarts, statistical databases, and ODS.
● Analytical queries do not complete processing.
● Analytical cube refresh does not complete.
● Drilldown and drill-across dimensions cannot be processed on more than two or three quarters of data.
Figure 14.1 shows the conceptual architecture of the current-state platforms in the enterprise.
To satisfy the business drivers along with current-state performance issues, this enterprise formed
a SWAT team to set the strategy and direction for developing a flexible, elastic, and scalable future-
state architecture.
The future-state architecture for the enterprise data platform was developed with the following
goals and requirements:
● Goals:
    ● Align best-fit technology and applications.
    ● Reduce overhead in data management.
    ● Reduce cost and spending on incumbent technologies.
    ● Implement governance processes for program and data management.
● Requirements:
    ● Ask any question about anything at any time.
    ● Query data from social media, unstructured content, and web database on one interface.
    ● Process clickstream and web activity data for near-real-time promotions for online shoppers.