Distributed Data Allocation - Physical Database Design


ated system can use its optimizer to maximize the efficiency of the reallocated database.

This implies that the updating of multiple heterogeneous sites will have some sort of

transaction support, like two-phase commit. This type of support is currently available

in some systems.
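To make the two-phase commit idea concrete, here is a minimal in-memory sketch of the protocol's two rounds: a voting (prepare) phase followed by a decision (commit or abort) phase. The `Participant` class and the `two_phase_commit` function are hypothetical names introduced for illustration only. The sketch leaves out logging, timeouts, and failure recovery, all of which a real system would need.

```python
from typing import List


class Participant:
    """Hypothetical stand-in for one site taking part in the update."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit  # assumption: the vote is fixed up front
        self.state = "init"

    def prepare(self) -> bool:
        # Phase 1: the site votes yes (prepared) or no (aborted).
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Return True if every site committed, False if the update was aborted."""
    # Phase 1 (voting): collect a vote from every site.
    votes = [p.prepare() for p in participants]

    # Phase 2 (decision): commit only if the vote was unanimous.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


if __name__ == "__main__":
    sites = [Participant("S1"), Participant("S2"), Participant("S3", can_commit=False)]
    print(two_phase_commit(sites), [p.state for p in sites])
```

The point of the unanimous vote is that a multisite update either takes effect at every site or at none of them.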

In the next few sections we illustrate how data allocation decisions can be made

with a simple model of performance.

16.2 Distributed Database Allocation

The conditions under which data allocation strategies may operate are determined by

the system architecture and the available federated database system software. The four

basic data allocation approaches are

• Centralized approach
• Partitioned approach
• Replicated data approach
• Selective replication approach

In the centralized approach, all the data is located at a single site. The implementation

of this approach is simple. However, the size of the database is limited by the availability

of disk storage at the central site. Furthermore, the database may become

unavailable from any of the remote sites when communication failures occur, and the

database system fails totally when the central site fails. This is clearly the least desirable

approach in terms of data accessibility and overall performance.

In the partitioned approach, the database is partitioned into its base tables, and

each table is assigned to a particular site, without replication. This strategy is only

appropriate when local secondary storage is limited compared to the database size.

The completely replicated data approach allocates a full copy of the database (all

tables) to each site in the network. This completely redundant allocation strategy is only

appropriate when reliability is extremely critical, disk space is abundant, and update

inefficiency can be tolerated.

The selective replication approach partitions the database into critical use and noncritical

use tables. Noncritical tables need only be stored once, while critical tables are

replicated as desired to meet the required level of availability and performance. In general,

this is the preferred approach since it balances data availability, query performance,

and update efficiency.
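The four approaches can be compared by writing each one as a mapping from tables to the set of sites holding a copy. The following is a minimal sketch under stated assumptions. The function names, the round-robin placement rule, and the `copies` parameter are hypothetical and do not come from the text, which does not say how a site is chosen for a given table.

```python
from typing import Dict, List, Set

Allocation = Dict[str, Set[str]]  # table name -> sites that store a copy


def centralized(tables: List[str], central_site: str) -> Allocation:
    # Every table lives at the single central site.
    return {t: {central_site} for t in tables}


def partitioned(tables: List[str], sites: List[str]) -> Allocation:
    # Each table is stored at exactly one site, with no replication.
    # Assumption: round-robin placement, chosen only for illustration.
    return {t: {sites[i % len(sites)]} for i, t in enumerate(tables)}


def fully_replicated(tables: List[str], sites: List[str]) -> Allocation:
    # Every site holds a full copy of every table.
    return {t: set(sites) for t in tables}


def selective(tables: List[str], sites: List[str],
              critical: Set[str], copies: int) -> Allocation:
    # Noncritical tables are stored once; critical tables get `copies`
    # replicas (capped at the number of sites), again placed round-robin.
    alloc: Allocation = {}
    for i, t in enumerate(tables):
        n = min(copies, len(sites)) if t in critical else 1
        alloc[t] = {sites[(i + k) % len(sites)] for k in range(n)}
    return alloc


def copies_stored(alloc: Allocation) -> int:
    # Total number of table copies: a rough proxy for storage cost and
    # for the number of sites that each round of updates must touch.
    return sum(len(site_set) for site_set in alloc.values())


if __name__ == "__main__":
    tables = ["customer", "orders", "parts"]
    sites = ["S1", "S2", "S3"]
    for name, alloc in [
        ("centralized", centralized(tables, "S1")),
        ("partitioned", partitioned(tables, sites)),
        ("fully replicated", fully_replicated(tables, sites)),
        ("selective", selective(tables, sites, {"orders"}, 2)),
    ]:
        print(name, copies_stored(alloc), alloc)
```

Counting copies this way shows why selective replication sits between the extremes: it stores more copies than the partitioned approach, but far fewer than full replication, and only the critical tables incur the extra update work.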

The cost/benefit of the replicated database allocation strategy can be estimated in

terms of storage cost, communication costs (query and update time), and data availability.

Figure 16.1 briefly illustrates the tradeoff by showing the data replication on the horizontal

axis and costs on the vertical axis. The following can be seen from Figure 16.1:
