Conclusion - Benchmarking Transaction and Analytical Processing Systems


One approach is to define the triggers and introduce conformance requirements that
every benchmark run has to adhere to. Another approach is to replace the database
schema with one at a higher degree of normalization to avoid data anomalies.

During the course of this work, the idea surfaced to include database schema
variation as another dimension besides data set scaling and workload mix.
Consequently, the tool chain was designed in a modular way so that the database
schema, transactions, and queries can be adapted or extended easily. Furthermore,
the database schema at a higher degree of normalization and the adapted queries
have already been defined as part of the evaluation of the impact of database
schema optimizations, and the benchmark has been run on top of this database
schema. Thus, the exchange of the database schema has already been implemented
and could be applied immediately.
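
To illustrate what treating the schema as a third dimension could look like, the following is a minimal sketch that enumerates benchmark runs over schema variant, data set scale, and workload mix. The names `RunConfig`, `enumerate_runs`, the variant labels, and the concrete values are hypothetical and are not taken from the CBTR tool chain.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RunConfig:
    """One point in the (hypothetical) benchmark parameter space."""
    schema_variant: str  # assumed label, e.g. "original" or "normalized"
    scale_factor: int    # assumed data set scaling factor
    oltp_share: float    # assumed fraction of OLTP work in the mix


def enumerate_runs(schema_variants, scale_factors, oltp_shares):
    """Return every combination of the three dimensions as a RunConfig."""
    return [
        RunConfig(schema, scale, share)
        for schema, scale, share in product(
            schema_variants, scale_factors, oltp_shares
        )
    ]


runs = enumerate_runs(["original", "normalized"], [1, 10], [0.2, 0.5, 0.8])
for run in runs:
    print(run)
```

Keeping the schema variant as an ordinary parameter next to scale and mix means that adding a further schema only extends one input list, which is the kind of modularity the paragraph above describes.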

CBTR currently uses throughput and response time as its metrics and thus
provides performance measurement results. Performance is not the only dimension
of interest, though, and further measures, e.g., energy consumption as in the
TPC benchmarks, have to be reported to achieve a more complete picture of a
system.
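
As a rough illustration of the two metrics named above, the sketch below derives throughput and response time figures from a list of measured request durations. The function name `summarize`, the choice of a mean plus a nearest-rank 95th percentile, and the sample values are assumptions made for this example; the source does not state how CBTR aggregates its measurements.

```python
import math
from statistics import mean


def summarize(response_times_s, wall_clock_s):
    """Summarize one run from per-request response times (in seconds).

    wall_clock_s is the elapsed time of the whole run, which can be shorter
    than the sum of response times when requests execute concurrently.
    """
    if not response_times_s or wall_clock_s <= 0:
        raise ValueError("need at least one measurement and a positive duration")
    ordered = sorted(response_times_s)
    # Nearest-rank 95th percentile (an assumed choice of aggregate).
    rank = math.ceil(0.95 * len(ordered))
    return {
        "throughput_per_s": len(ordered) / wall_clock_s,
        "mean_response_s": mean(ordered),
        "p95_response_s": ordered[rank - 1],
    }


summary = summarize([0.1, 0.2, 0.3, 0.4], wall_clock_s=2.0)
print(summary)
```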

As briefly mentioned above, the restriction of optimizations is a topic that has
to be worked out concretely. If the results of benchmark runs are supposed to be
fair and comparable, especially if the tests are undertaken by different companies
or research groups, rules are necessary that restrict the available optimizations,
and methods are needed to verify adherence to them, like the detailed reports
required in existing standard benchmarks.

A research challenge that is not covered by hybrid or mixed workload OLTP and
OLAP systems themselves is reporting across several transactional systems and
other, possibly external, data sources. This is a feature of major importance in
today's analytical systems in large enterprises. Strategies such as federated
query processing might be introduced on top to simulate information extraction
and integration from multiple source systems.
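
The idea of reporting across several source systems can be sketched in a few lines. The example below is a deliberately simplified illustration, not part of CBTR: two in-memory SQLite databases stand in for independent transactional systems, each source computes a partial aggregate, and a mediator function merges the partial results. The table name `sales`, the helper names, and the data are all hypothetical.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def federated_revenue(sources):
    """Push the aggregation to each source, then merge the partial sums."""
    totals = {}
    for conn in sources:
        query = "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
        for customer, subtotal in conn.execute(query):
            totals[customer] = totals.get(customer, 0.0) + subtotal
    return totals


system_a = make_source([("acme", 100.0), ("globex", 50.0)])
system_b = make_source([("acme", 25.0)])
report = federated_revenue([system_a, system_b])
print(report)
```

Pushing the `GROUP BY` down to each source keeps the amount of data moved to the mediator small; a real federated setup would additionally have to deal with schema differences and source availability, which this sketch ignores.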

CBTR and its tool chain, developed as part of this thesis, have already been used
in two industry projects to determine the overhead of virtualization on raw
database performance. In another industry project, CBTR is currently being
extended to simulate and evaluate multi-tenancy mixed workload environments.
Thus, the work in this thesis has already provided a basis for evaluation projects
in industry and sparked new work in the area of benchmarking.
