model and data/index storage structure through customization. This design
decision follows the suggestion of Chaudhuri and Weikum [6].
ECOS supports fine-grained customization at the table level and column level,
following the recommendations and results of [2,10,12]. Based on our previously
published results [18], we also identified the need to autonomically replace the
existing data and index storage structures with more appropriate ones as data
management needs change. We named our solution Evolutionary
Column-Oriented Storage (ECOS), which is based on the existing Decomposed
Storage Model (DSM) [10]. It uses hierarchically-organized storage structures
for each column with an innovative evolution mechanism, which enables auto-
nomic selection of the most suitable storage structure along the hierarchy (as the
levels of the hierarchy increase). Furthermore, we present four possible variations
of the standard 2-copy DSM that reduce its high storage requirement. We evaluated
ECOS empirically using a custom micro benchmark, and our results show that
ECOS self-tunes the storage structure while maintaining the required
performance; it also yields minor performance gains. Finally, we
propose a mechanism called evolution path to define the storage structure
evolution, which reduces the need for human intervention in long-term database
maintenance.
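To illustrate why the standard 2-copy DSM has a high storage requirement, the following minimal Python sketch decomposes a few rows into one binary (surrogate, value) relation per attribute and then builds the two copies, one ordered by surrogate and one ordered by value. The function names `decompose` and `two_copy`, the use of the row position as surrogate, and the sample values are assumptions made for illustration; this is not the ECOS implementation.

```python
from __future__ import annotations

from typing import Any


def decompose(rows: list[dict[str, Any]]) -> dict[str, list[tuple[int, Any]]]:
    """Split row-oriented records into one binary relation per attribute.

    Each binary relation holds (surrogate, value) pairs. The surrogate is
    simply the row position here, which is an assumption of this sketch.
    """
    columns: dict[str, list[tuple[int, Any]]] = {}
    for surrogate, row in enumerate(rows):
        for attribute, value in row.items():
            columns.setdefault(attribute, []).append((surrogate, value))
    return columns


def two_copy(pairs: list[tuple[int, Any]]):
    """Return the two copies that 2-copy DSM keeps for one attribute."""
    by_surrogate = sorted(pairs, key=lambda p: p[0])      # ordered by surrogate
    by_value = sorted(pairs, key=lambda p: (p[1], p[0]))  # ordered by value
    return by_surrogate, by_value


# Illustrative sample rows (made-up values, LINEITEM-style column names).
rows = [
    {"l_orderkey": 1, "l_quantity": 17},
    {"l_orderkey": 1, "l_quantity": 36},
    {"l_orderkey": 2, "l_quantity": 8},
]

columns = decompose(rows)
by_surrogate, by_value = two_copy(columns["l_quantity"])

# Every value is stored twice, each time together with its surrogate.
stored_pairs = len(by_surrogate) + len(by_value)
```

In this sketch, three quantity values lead to six stored (surrogate, value) pairs, which is the overhead that variations of 2-copy DSM aim to reduce.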
This paper is organized as follows. Section 2 defines the problem and justi-
fies the motivation for the proposed design. Section 3 explains the concepts of
ECOS and evolution path in detail. Section 4 introduces the prototype imple-
mentation and gives details of the empirical evaluation of the proposed concepts
using a custom micro benchmark. Section 5 outlines related work. Section 6
concludes the paper with directions for future work.
2 Problem Statement and Motivation
Specific storage structures have characteristics suitable for certain data sizes and
access patterns. As both of these aspects may change over the course of data
usage, there is no single storage solution that provides optimal performance in
every situation. Therefore, we propose an autonomic adjustment of the storage
structures. In this section, we explain the motivation for some critical design
decisions in ECOS. To explain the problem in detail, we take the LINEITEM
table of the TPC Benchmark™ H (TPC-H) [17] schema as an example. We
generated the benchmark data with a scale factor of one and gathered statistics
for the LINEITEM table, as shown in Table 1.
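The idea of adjusting a column's storage structure as its data size changes can be sketched as follows. This is a minimal illustration under stated assumptions: the structure names, the size limits, and the `Column` class are hypothetical and are not taken from the ECOS prototype; the sketch only shows a column moving one level up an ordered list of structures once its current level's limit is exceeded.

```python
from dataclasses import dataclass

# Hypothetical ordered list of (structure name, maximum number of values).
# Names and limits are illustrative assumptions only.
EVOLUTION_LEVELS = [
    ("sorted_array", 1_000),
    ("sorted_list", 100_000),
    ("b_plus_tree", None),  # last level: no upper limit
]


@dataclass
class Column:
    name: str
    level: int = 0  # index into EVOLUTION_LEVELS
    size: int = 0   # number of stored values

    @property
    def structure(self) -> str:
        return EVOLUTION_LEVELS[self.level][0]

    def insert(self, count: int = 1) -> None:
        """Record `count` new values, then re-check the structure choice."""
        self.size += count
        self._evolve()

    def _evolve(self) -> None:
        """Move up the levels until the current limit fits the size."""
        while True:
            limit = EVOLUTION_LEVELS[self.level][1]
            if limit is None or self.size <= limit:
                return
            self.level += 1


col = Column("l_quantity")
col.insert(500)
first = col.structure    # still within the first level's limit
col.insert(5_000)
second = col.structure   # 5,500 values exceed the first limit
col.insert(200_000)
third = col.structure    # 205,500 values exceed the second limit
```

A real system would also have to migrate the stored data when the level changes and would likely consider access patterns in addition to size; both are omitted here to keep the sketch short.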
Why column-oriented storage model? The column-oriented storage model
is derived from the earlier work on DSM [10]. DSM is a transposed storage model [4]
that stores all values of the same attribute of a relation in the relational
conceptual schema together [10]. Copeland and Khoshafian [10,20] identified many
advantages of DSM, including simplicity (which they related to
RISC [16]), less user involvement, fewer performance tuning requirements,
reliability, increased physical data independence and availability, and support of
 