Emerging Database Systems in Support of Scientific Data - Scientific Data Management


By using version management instead of record-wise locking techniques (which becomes possible largely because the number of concurrent updates is typically much smaller), each query would see a consistent database state, and few or no locks would ever be needed. Version management is also what users of complex analytics need in order to keep track of their many different analysis paths across the database.

It is possible (although probably often not necessary) to index every attribute, since searches greatly dominate over updates and because adding an index to an attribute requires only that attribute to be read, not the entire row.

Data compression is likely to be profitable because the data belonging to one attribute are highly likely to be homogeneous and even auto-correlated.

Adding or deleting attributes of a table is likely to be cheap since only the relevant data would be accessed. Updates of an attribute are likely to be relatively cheap because no irrelevant attribute values need to be read or written.
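The points above about per-attribute indexing, compression, and cheap schema changes can be illustrated with a minimal sketch of attribute-wise storage. It is not taken from any system described in the text: the names `ColumnTable`, `rle_encode`, and `rle_decode` are hypothetical, the data are invented for illustration, and run-length encoding is used only as one simple example of a scheme that benefits from homogeneous column values.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a column as a list of (value, run_length) pairs."""
    return [(v, len(list(run))) for v, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original values."""
    return [v for v, n in runs for _ in range(n)]


class ColumnTable:
    """Toy table (hypothetical) that stores each attribute as its own list."""

    def __init__(self):
        self.columns = {}   # attribute name -> list of values
        self.indexes = {}   # attribute name -> {value: [row ids]}
        self.n_rows = 0

    def add_attribute(self, name, values):
        # Only the new column is written; existing columns are untouched.
        values = list(values)
        if self.columns and len(values) != self.n_rows:
            raise ValueError("column length must match the table")
        self.columns[name] = values
        self.n_rows = len(values)

    def drop_attribute(self, name):
        # Only the dropped column (and its index, if any) is removed.
        del self.columns[name]
        self.indexes.pop(name, None)

    def build_index(self, name):
        # Building the index reads a single column, never whole rows.
        index = {}
        for row_id, value in enumerate(self.columns[name]):
            index.setdefault(value, []).append(row_id)
        self.indexes[name] = index

    def lookup(self, name, value):
        # Return the row ids whose attribute equals the given value.
        return self.indexes[name].get(value, [])


# Invented sample data: two attributes of six rows each.
table = ColumnTable()
table.add_attribute("station", ["A", "A", "A", "B", "B", "C"])
table.add_attribute("temp_c", [12, 12, 13, 13, 13, 13])
table.build_index("station")

# A homogeneous column collapses into a few runs.
runs = rle_encode(table.columns["temp_c"])
```

In this sketch, building the index on `station` never touches `temp_c`, and dropping `temp_c` leaves `station` and its index as they were. A row-wise layout would have to read or rewrite every row for the same operations.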

Similar observations were made much earlier,³ but were for a long time considered irrelevant in mainstream database research. In fact, these alternative design principles have only comparatively recently received serious attention.

This renewed interest is at least partly due to the fact that many very large databases today are actually used for data warehousing, decision support, or business or security intelligence applications, areas with characteristics similar to those claimed above. In Svensson,³ the following additional observations were made:

In scientific data analysis, the number of simultaneous users is typically much smaller than in large-scale commercial applications; but on the other hand the users tend to put more complex, usually unanticipated, queries to the system.

A more automatic response to complex user requests is required from the system components, since systems or database specialists are usually not available in scientific applications.

A system should be transportable to many computer models, including medium-sized computers, because in scientific data analysis applications many users prefer to work with an in-house computer dedicated to the acquisition and analysis of data.

One of the first systems to be developed based on the above principles was Cantor.⁴⁻⁷ The Cantor project pioneered the analysis and coordinated application of many of the above-mentioned techniques and concepts in relational systems.
