Database Schema Variants for Mixed OLTP and OLAP - Benchmarking Transaction and Analytical Processing Systems

Databases Reference

In-Depth Information

the queries defined on top of them. The query changes will also be presented in the

following.

5.2.1

Document-Oriented Schema

In the first variant, header and item tables are joined, because header and item

information is mostly requested in combination. To create a schema variant based

on pre-joined tables, the typical queries taking place in a system (the CBTR queries

for the order-to-cash process in the case of this thesis) have to be taken into account

to identify entities to be joined. From a business perspective, the data about the line

items of a sales order is never queried without accessing the sales order header.

Even in contexts such as top n sold products the header information is necessary to

provide the time dimension, as these queries always reference a certain time frame,

for example last quarter. This schema is called “document-oriented”, as it reflects a

complete print-out document of, e.g., a sales order.

Figure 5.3 illustrates the document-oriented schema derived from the schema

introduced in Sect. 4.2.2 . From the eight transaction tables comprising header and

item data of sales orders, shipments, bills, and accounting documents, four tables are

obtained, which in each case pre-compute the join between the header and the item

table via the document identifier. As an example, the sales facts table is produced

from joining the sales header table to the sales item table using the primary key order

ID from the sales header table and its pendent in the sales item table for the equijoin

condition. Here, and also for the other three sets an inner join is sufficient in a sound

order-to-cash database to produce the new tables, as a header is invalid without line

items and a line item always belongs to a header. All other tables remain unchanged.

In addition to changing the schema, the queries have to be adapted. The changes

to the queries are classified as follows:

Join-A When replacing the join of tables in the FROM clause of a selection by a

pre-joined table that exactly represents the join in the statement, only the name

of the table in the FROM clause is replaced by the new fact table and the join

conditions used for the pre-computed join in the fact table are removed from the

WHERE clause of the statement. A typical example for this is the pre-computed

join of header and item tables.

Join-B1 For all selections and computations solely based on columns that have

been part of tables where tuples have been multiplied during the pre-join, a sub-

select with DISTINCT command is added to filter out the added redundancy. This

redundancy has been created when the tuples were duplicated for each of their

associated tuples, e.g., header tuples were copied for each of their associated line

item tuples in the header item pre-joined table. The selected distinct columns in

the sub-select match the selected columns in the main select with the addition of

the primary key columns of the former tables that have been multiplied and on

which the selection or aggregation is based (if they are not included already). This

Search WWH ::

Custom Search

Home