ECOS is autonomic: it exploits the evolution path to evolve the storage
structures automatically, i.e., our approach to self-tuning is online.
Consider the L_ORDERKEY column of the LINEITEM table, as shown in
Table 1. Suppose that we, as database designers, design this table. According to
our application design, we select the L_ORDERKEY column as part of the primary
key. As discussed in Section 3, we have to customize each column
as either ordered read-optimized or unordered write-optimized. Therefore, we
customize the L_ORDERKEY column as ordered read-optimized. At initial
design time, we design according to domain knowledge, our experience, and
predictions. As designers, it is difficult to guarantee how much this column
will grow and how long it will take to reach that size. When we customize the
column as ordered read-optimized, it is internally initialized as a sorted array.
For the L_ORDERKEY column, the three initial rows of the sample evolution path
in Table 7 are relevant.
As mentioned in Section 3, ECOS limits the storage capacity of each
storage structure. Therefore, the initial sorted array has a fixed data storage
capacity limit; for example, 4KB. As long as the data stays within the 4KB
limit, the sorted array is the storage structure for the L_ORDERKEY column, and
we gather heredity information for the column, such as the number of Get(),
Put(), and Delete() calls, the number of range Get() calls (for range queries),
and the number of Get() calls for all records (for scan queries). Which
heredity information is gathered may vary from one implementation to
another. Here, we simplify the discussion by assuming that a system can use the
heredity information to identify whether the workload is read-intensive or
write-intensive and whether the access to data is ordered (range queries) or
unordered (point or scan queries).
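The heredity counters and the two-way classification described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `Heredity`, its field names, and the majority-based thresholds are hypothetical and are not taken from ECOS itself.

```python
from dataclasses import dataclass


@dataclass
class Heredity:
    """Per-column operation counters (names are illustrative, not from ECOS)."""
    gets: int = 0        # point Get() calls
    puts: int = 0        # Put() calls
    deletes: int = 0     # Delete() calls
    range_gets: int = 0  # range Get() calls (range queries)
    scans: int = 0       # Get() for all records (scan queries)

    def workload(self) -> str:
        reads = self.gets + self.range_gets + self.scans
        writes = self.puts + self.deletes
        # Assumption: a simple majority decides; ties count as read-intensive.
        return "read-intensive" if reads >= writes else "write-intensive"

    def access(self) -> str:
        # Assumption: access is "ordered" when range queries outnumber
        # point and scan queries combined.
        if self.range_gets > self.gets + self.scans:
            return "ordered"
        return "unordered"


h = Heredity(gets=70, puts=20, deletes=5, range_gets=10, scans=3)
print(h.workload(), h.access())
```

With these sample counts, reads (83) outnumber writes (25) and range queries (10) are a minority of reads, so the column is classified as read-intensive with unordered access.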
The moment the storage limit of the sorted array is consumed, an event is
raised for notification. This event triggers all three initial mutation rules of
Table 7, and heredity-based selection identifies which one of them to execute.
Suppose that, for the L_ORDERKEY column, the workload is read-intensive and
the data access is unordered. This scenario executes the first mutation rule of
Table 7, which evolves the existing sorted array into a sorted list. The sorted
list is now the storage structure, and it is also constrained by a storage limit
according to the design principle of ECOS. As long as the L_ORDERKEY column
data is within the storage limit of the sorted list, heredity information is
gathered, and it is used for the next evolution.
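The evolution step just described (limit consumed, event raised, rule selected by heredity, structure replaced) might look roughly like the sketch below. It is a simplified illustration, not the ECOS implementation: the `Column` class, the rule dictionary, the fixed 8-byte value width, and the 64KB sorted-list limit are all assumptions, and only the single mutation rule stated in the text is encoded because the other rules of Table 7 are not reproduced here.

```python
import bisect

# (current structure, workload, access pattern) -> next structure.
# Only the one rule spelled out in the text is encoded; the remaining
# rules of Table 7 are not reproduced here.
MUTATION_RULES = {
    ("sorted array", "read-intensive", "unordered"): "sorted list",
}

# Assumed capacity limits in bytes. 4KB comes from the text's example;
# the sorted-list limit is a made-up placeholder.
CAPACITY_BYTES = {"sorted array": 4 * 1024, "sorted list": 64 * 1024}


class Column:
    def __init__(self, name, workload, access, item_bytes=8):
        self.name = name
        self.structure = "sorted array"  # ordered read-optimized start
        self.workload = workload         # would be derived from heredity info
        self.access = access
        self.item_bytes = item_bytes     # assumed fixed-width values
        self.values = []                 # a plain list stands in for every structure

    def used_bytes(self):
        return len(self.values) * self.item_bytes

    def put(self, value):
        bisect.insort(self.values, value)
        if self.used_bytes() >= CAPACITY_BYTES[self.structure]:
            self.on_limit_reached()      # the "event" from the text

    def on_limit_reached(self):
        # Heredity-based selection: pick the rule matching the observed workload.
        key = (self.structure, self.workload, self.access)
        next_structure = MUTATION_RULES.get(key)
        if next_structure is not None:
            self.structure = next_structure


col = Column("L_ORDERKEY", "read-intensive", "unordered")
for key in range(511):
    col.put(key)
print(col.structure)  # 511 * 8 bytes is still below 4KB
col.put(511)          # 512 * 8 bytes = 4KB, so the limit is consumed
print(col.structure)
```

Running the sketch prints `sorted array` and then `sorted list`: the 512th value consumes the 4KB limit, and the read-intensive, unordered workload selects the sorted-array-to-sorted-list rule. A column whose workload matches no encoded rule simply keeps its current structure.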
Table 1 shows that only half of the LINEITEM columns, i.e., the
eight out of sixteen with high data growth, evolve during the first evolution.
The rest of the columns can be stored within an array (either a heap array or a
sorted array). Furthermore, only half of the columns that evolved during the
first evolution, i.e., four out of eight, evolve again during the
second evolution (L_ORDERKEY, L_COMMENT, L_EXTENDEDPRICE,
and L_PARTKEY). The final state of the table presented in Table 1 shows that
each column uses the appropriate storage structure (assumed here for
explanation) according to the stored data size and observed workload. We can add
more parameters for evolution decisions, but we only used limited parameters
 