Foundational data architecture patterns - Making Sense of NoSQL


Table 3.3 A comparison of OLTP and OLAP systems (continued)

|  | Online transaction processing (OLTP) | Online analytical processing (OLAP) |
|---|---|---|
| Key structures | Tables with multiple levels of joins | Star or snowflake designs with a large central fact table and dimension tables to categorize facts. Aggregate structures with summary data are precomputed. |
| Typical criteria for success | Handles many concurrent users constantly making changes without any bottlenecks | Analysts can easily generate new reports on millions of records, quickly get key insights into trends, and spot new business opportunities. |
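
The OLAP "key structures" row can be made concrete with a small sketch. The following is a minimal illustration, not a production design, and it uses Python's built-in sqlite3 module only as a convenient stand-in for a warehouse engine. The table names (fact_sales, dim_product, dim_date, agg_sales_by_category) and the sample rows are hypothetical and chosen purely for illustration.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny star schema: one central fact table, two dimension tables.

    All table names, column names, and sample rows are hypothetical.
    """
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id    INTEGER REFERENCES dim_date(date_id),
            amount     REAL
        );
    """)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "toys"), (2, "books")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, 2013, 1), (2, 2013, 2)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 1, 10.0), (1, 2, 15.0), (2, 1, 7.5), (2, 2, 2.5)])


def precompute_summary(conn):
    """Materialize an aggregate table so reports do not rescan the fact table."""
    conn.execute("""
        CREATE TABLE agg_sales_by_category AS
        SELECT p.category AS category, SUM(f.amount) AS total_amount
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
    """)


def category_totals(conn):
    """Read the precomputed summary as a {category: total} mapping."""
    cursor = conn.execute(
        "SELECT category, total_amount FROM agg_sales_by_category ORDER BY category"
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    precompute_summary(connection)
    print(category_totals(connection))
```

The dimension tables categorize each fact (which product, which date), while the aggregate table holds summary data computed ahead of time, so a report reads a handful of summary rows instead of scanning every sale.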

In this chapter, we've focused on general-purpose transactional database systems that interact in a real-time environment, on an event-by-event basis. These real-time systems are designed to store and protect records of events such as sales transactions, button-clicks on a web page, and transfers of funds between accounts. The class of systems we turn to now isn't concerned with button-clicks, but rather with analyzing past events and drawing conclusions based on that information.

3.5.1 How data flows from operational systems to analytical systems

OLAP systems, frequently used in data warehouse/business intelligence (DW/BI) applications, aren't concerned with new data, but rather focus on the rapid analysis of events in the past to make predictions about future events.

In OLAP systems, data flows from real-time operational systems into downstream analytical systems as a way to separate daily transactions from the job of doing analysis on historical data. This separation of concerns is important when designing NoSQL systems, because the requirements of operational systems are dramatically different from the requirements of analytical systems.
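
The separation described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how any particular DW/BI product works: two in-memory SQLite databases stand in for the operational and analytical systems, and the names orders, daily_sales, record_order, and load_history are hypothetical.

```python
import sqlite3


def make_operational_db():
    """Stand-in for an OLTP system that records events as they happen."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, day TEXT, amount REAL)"
    )
    return db


def record_order(db, day, amount):
    """One real-time transaction against the operational system."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT INTO orders (day, amount) VALUES (?, ?)", (day, amount))


def make_analytical_db():
    """Stand-in for a downstream analytical store holding summarized history."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE daily_sales (day TEXT PRIMARY KEY, order_count INTEGER, total REAL)"
    )
    return db


def load_history(operational, analytical):
    """Batch step: summarize operational rows and copy them downstream.

    Returns the number of summary rows written.
    """
    rows = operational.execute(
        "SELECT day, COUNT(*), SUM(amount) FROM orders GROUP BY day"
    ).fetchall()
    with analytical:
        analytical.executemany(
            "INSERT OR REPLACE INTO daily_sales VALUES (?, ?, ?)", rows
        )
    return len(rows)


def daily_report(analytical):
    """Analysis runs only against the analytical store, never the live system."""
    return analytical.execute(
        "SELECT day, order_count, total FROM daily_sales ORDER BY day"
    ).fetchall()


if __name__ == "__main__":
    oltp, olap = make_operational_db(), make_analytical_db()
    record_order(oltp, "2013-01-01", 10.0)
    record_order(oltp, "2013-01-02", 2.5)
    load_history(oltp, olap)
    print(daily_report(olap))
```

In this sketch the operational database only ever handles small inserts, while reporting queries touch only the analytical copy, so a long-running report cannot slow down the system that takes orders.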

BI systems evolved because running summary reports that traversed millions of rows on production databases was inefficient and slowed production systems during peak workloads. Running reports on a mirrored system was an option, but the reports still took a long time to run and were inefficient from an employee productivity perspective. Sometime in the 1980s a new class of databases emerged, specifically designed for rapid ad hoc analysis of data, even across millions or billions of rows. The pioneers of these systems came not from web companies but from firms that needed to understand retail store sales patterns and predict what items should be in the store and when.

Let's look at a data flow diagram of how this works. Figure 3.10 shows the typical data flow and some of the names associated with different regions of the business intelligence and data warehouse data flow.

Each region in this diagram is responsible for specific tasks. Data that's constantly changing during daily operations is stored on the left side of the diagram inside
