OPTIMIZING ASSERTED VERSIONING DATABASES - Managing Time in Relational Databases

Databases Reference

In-Depth Information

best predicate might be {circa_asr_flag IN ('Y', 'N')} when it is

uncertain if the version is currently asserted. With this predicate,

the index can support searches for past assertions as well as

searches for current ones. Otherwise, it will require a separate

index to support searches for past assertions.

Clustering

Clustering and partitioning often go together, depending on

the reason for partitioning and the way in which specific DBMSs

support it. Whether or not partitioning is used, choosing the best

clustering sequence can dramatically reduce I/O and improve

performance.

The general concept behind clustering is that as the database

is modified, the DBMS will attempt to keep the data on physical

pages in the same order as that specified in the clustering index.

But each DBMS does this a little differently. One DBMS will clus-

ter each time an insert or update is processed. Another will make

a valiant attempt to do that. A third will only cluster when the

table is reorganized. But regardless of the approach, the result

is to reduce physical I/O by locating data that is frequently

accessed together as physically close together as possible.

Early DBMSs only allowed one clustering index, but newer

releases often support multiple clustering sequences, sometimes

called indexed views or multi-dimensional clustering.

It is important to determine the most frequently used access

paths to the data. Often the most frequently used access paths

are ones based on one or more foreign keys. For asserted version

tables, currently asserted current versions are usually the most

frequently queried data.

Sometimes, the right combination of foreign keys can provide

good clustering for more than one access path. For example,

suppose that a policy table has two low cardinality TFKs, prod-

uct type and market segment, and that each TFK value has

thousands of related policies. 3 We might then create this cluster-

ing index:

{circa_asr_flag, product_type_oid, market_segment_oid,

eff_end_dt, policy_oid}

The circa flag would cluster most of the currently asserted

rows together, keeping them physically co-located under the

lower cardinality columns. Clustering would continue based on

3 Low cardinality means that there are fewer distinct values for the field in the table

which results in more rows having a single value.

Search WWH ::

Custom Search

Home