Thus evolved the first generation of OLTP applications. Around the same time, in 1970, Edgar F.
Codd published his paper on the relational model of systems for managing data [1]. The paper was
pivotal in several ways:
It introduced for the first time a relationship-based approach to understanding data.
It introduced the first approach to modeling data.
It introduced the idea of abstracting the management and storage of data from the user.
It discussed the idea of isolating applications and data.
It discussed the idea of removing duplicates and reducing redundancy.
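The last idea in the list, removing duplicates and reducing redundancy, can be illustrated with a small sketch. It is not taken from Codd's paper: the table names, column names, and sample rows are hypothetical, and SQLite is used only because it ships with Python. The flat input repeats each customer's city on every order, while the two-table layout stores it once.

```python
import sqlite3

# Hypothetical flat order records: the customer's city is repeated on every row.
flat_orders = [
    (1, "Acme", "Chicago", 120.0),
    (2, "Acme", "Chicago", 80.0),
    (3, "Globex", "Boston", 45.5),
]


def build_normalized_db(rows):
    """Split flat rows into a customer relation and an orders relation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (name TEXT PRIMARY KEY, city TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, "
        "customer_name TEXT NOT NULL REFERENCES customer(name), "
        "amount REAL NOT NULL)"
    )
    for order_id, name, city, amount in rows:
        # Each customer is stored once, however many orders refer to it.
        conn.execute(
            "INSERT OR IGNORE INTO customer (name, city) VALUES (?, ?)",
            (name, city),
        )
        conn.execute(
            "INSERT INTO orders (order_id, customer_name, amount) VALUES (?, ?, ?)",
            (order_id, name, amount),
        )
    conn.commit()
    return conn


conn = build_normalized_db(flat_orders)
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# A join recovers the original flat view without storing the city twice.
joined = conn.execute(
    "SELECT o.order_id, c.city FROM orders o "
    "JOIN customer c ON c.name = o.customer_name ORDER BY o.order_id"
).fetchall()
```

Because the city lives in one row per customer, correcting it requires a single update rather than one per order.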
Codd's paper and the release of System R, the first experimental relational database, provided the
first glimpse of the move to a relational model of database systems. The subsequent emergence of
multiple relational databases, such as Oracle RDB, Sybase, and SQL/DS, over a few years in the 1980s
was coupled with the first editions of the SQL language. OLTP systems grew stronger on
the relational model; for the first time, companies were presented with two-tier applications in which
the graphical user interface (GUI) was powerful enough to model front-end needs and the underlying
data was completely encapsulated from the end user.
In the late 1970s and early 1980s, the first concepts of data warehousing emerged from the need to
store and analyze the data produced by OLTP systems. The ability to gather transactions, products,
services, and locations over a period of time gave companies interesting capabilities that had never
existed in the OLTP world, partly because of the design of OLTP systems and partly because of the
limited scalability of the infrastructure.
Traditional data warehousing, or data warehousing 1.0
In the early days of OLTP systems, companies developed multiple applications to solve different
data needs. This was good from the company's perspective because each system processed data quickly
and returned results, but the downside was that the results from two systems often did not match. For
example, one system would report sales of $5,000 for the day and another would report $35,000 for
the day, for the same data. Reconciling data across the systems proved to be a nightmare.
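One way such a mismatch can arise is that two systems apply different business rules to the same transactions. The sketch below is a hypothetical illustration: the system names, the `status` field, and the rule that one system counts only completed sales are assumptions made for the example, not details from the text. It reproduces the $5,000 versus $35,000 figures and computes the gap a reconciliation would have to explain.

```python
from decimal import Decimal

# Hypothetical shared transaction feed read by both systems.
transactions = [
    {"id": 1, "amount": Decimal("5000"), "status": "completed"},
    {"id": 2, "amount": Decimal("20000"), "status": "pending"},
    {"id": 3, "amount": Decimal("10000"), "status": "pending"},
]


def billing_daily_sales(rows):
    """Assumed rule for system A: count only completed sales."""
    return sum(
        (r["amount"] for r in rows if r["status"] == "completed"),
        Decimal("0"),
    )


def order_entry_daily_sales(rows):
    """Assumed rule for system B: count every order, whatever its status."""
    return sum((r["amount"] for r in rows), Decimal("0"))


def reconcile(rows):
    """Report both totals and the difference between them."""
    billing = billing_daily_sales(rows)
    order_entry = order_entry_daily_sales(rows)
    return {
        "billing": billing,
        "order_entry": order_entry,
        "difference": order_entry - billing,
    }


report = reconcile(transactions)
```

Neither total is wrong under its own rule; the disagreement comes from the missing shared definition of a sale, which is the kind of gap an integrated store is meant to close.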
Bill Inmon's definition of a data warehouse, which the industry accepts as the standard, states
that the data warehouse is a subject-oriented, nonvolatile, integrated, time-variant collection of
data in support of management's decisions [2].
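The four properties in this definition can be made concrete with a toy sketch. It is only one possible reading under stated assumptions: the class, the field names, and the region-code mapping are hypothetical and are not taken from Inmon's work. The store holds one subject (sales), appends facts without updating or deleting them (nonvolatile), maps source-specific codes to a shared vocabulary (integrated), and stamps every fact with a date so that past states can be queried (time-variant).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SalesFact:
    snapshot_date: date
    source_system: str
    region: str
    amount: float


# Hypothetical mapping that reconciles source-specific region codes.
REGION_CODES = {"NE": "northeast", "N-EAST": "northeast", "SW": "southwest"}


class SalesWarehouse:
    """Append-only store for a single subject area: sales."""

    def __init__(self):
        self._facts = []

    def load(self, snapshot_date, source_system, region_code, amount):
        # Integrated: translate each source's code into one shared value.
        region = REGION_CODES[region_code]
        # Nonvolatile: facts are only appended, never updated or deleted.
        self._facts.append(SalesFact(snapshot_date, source_system, region, amount))

    def total_by_region(self, as_of):
        # Time-variant: answer the question as of a chosen date.
        totals = {}
        for fact in self._facts:
            if fact.snapshot_date <= as_of:
                totals[fact.region] = totals.get(fact.region, 0.0) + fact.amount
        return totals


warehouse = SalesWarehouse()
warehouse.load(date(2024, 1, 1), "billing", "NE", 100.0)
warehouse.load(date(2024, 1, 1), "order_entry", "N-EAST", 50.0)
warehouse.load(date(2024, 1, 2), "billing", "SW", 70.0)

jan1 = warehouse.total_by_region(date(2024, 1, 1))
jan2 = warehouse.total_by_region(date(2024, 1, 2))
```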
The first generation of data warehouses, which we have built and continue to build, is tightly tied
to the relational model and follows the principles of Codd's data rules. The design and architecture
of the data warehouse have two parts. The first part deals with the data architecture and processing;
following Codd's paper, it addresses the encapsulation of data from the user. The second part deals
with the database architecture, infrastructure, and system architecture. Let us take a quick overview
of the data architecture and the infrastructure of the data warehouse before we discuss the challenges
and pitfalls of traditional data warehousing.
[1] Codd, E. F. (1970). A Relational Model of Data for Large Shared Data Banks. Communications of the ACM, 13(6), 377-387. doi:10.1145/362384.362685.
[2] http://www.inmoncif.com/home/
 