ated system can use its optimizer to maximize the efficiency of the reallocated database.
This implies that the updating of multiple heterogeneous sites will have some sort of
transaction support, like two-phase commit. This type of support is currently available
in some systems.
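As a rough illustration of the two-phase commit idea mentioned above, the following minimal sketch shows a coordinator that commits an update at every site only if all sites vote to commit. The `Site` class, its fields, and the `two_phase_commit` function are hypothetical names introduced for this example; a real protocol also needs logging, timeouts, and recovery, which are omitted here.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical participant site holding one copy of the data."""
    name: str
    can_commit: bool = True   # how this site will vote in phase 1 (assumed)
    state: str = "idle"       # idle -> prepared -> committed / aborted

    def prepare(self) -> bool:
        # Phase 1: the site votes; a "yes" vote moves it to the prepared state.
        if self.can_commit:
            self.state = "prepared"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(sites: list[Site]) -> bool:
    """Commit at every site or at none. Returns True if committed."""
    # Phase 1 (voting): collect a vote from every site.
    votes = [site.prepare() for site in sites]
    # Phase 2 (decision): commit only on a unanimous "yes" vote.
    if all(votes):
        for site in sites:
            site.commit()
        return True
    for site in sites:
        site.abort()
    return False
```

A single "no" vote aborts the update at every site, so the copies never diverge.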
In the next few sections we illustrate how data allocation decisions can be made
with a simple model of performance.
16.2 Distributed Database Allocation
The conditions under which data allocation strategies may operate are determined by
the system architecture and the available federated database system software. The four
basic data allocation approaches are
Centralized approach
Partitioned approach
Replicated data approach
Selective replication approach
In the centralized approach, all the data is located at a single site. The implementation
of this approach is simple. However, the size of the database is limited by the availability
of the disk storage at the central site. Furthermore, the database may become
unavailable from any of the remote sites when communication failures occur, and the
database system fails totally when the central site fails. This is clearly the least desirable
approach in terms of data accessibility and overall performance.
In the partitioned approach, the database is partitioned into its base tables, and
each table is assigned to a particular site, without replication. This strategy is only
appropriate when local secondary storage is limited compared to the database size.
The completely replicated data approach allocates a full copy of the database (all
tables) to each site in the network. This completely redundant allocation strategy is only
appropriate when reliability is extremely critical, disk space is abundant, and update
inefficiency can be tolerated.
The selective replication approach partitions the database into critical use and noncritical
use tables. Noncritical tables need only be stored once, while critical tables are
replicated as desired to meet the required level of availability and performance. In general,
this is the preferred approach since it balances data availability, query performance,
and update efficiency.
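The four approaches above can be compared with a small sketch that represents an allocation as a mapping from each table to the set of sites holding a copy. The site names, table names, and helper functions below are all hypothetical. The round-robin placement of single-copy tables, and the choice to replicate critical tables at every site, are assumptions made only to keep the example short.

```python
SITES = ["S1", "S2", "S3"]                     # hypothetical sites
TABLES = ["customer", "orders", "audit_log"]   # hypothetical base tables


def centralized(tables, sites):
    # All tables at a single site (arbitrarily, the first one).
    return {t: {sites[0]} for t in tables}


def partitioned(tables, sites):
    # Each table at exactly one site, no replication.
    # Round-robin placement is an assumption of this sketch.
    return {t: {sites[i % len(sites)]} for i, t in enumerate(tables)}


def fully_replicated(tables, sites):
    # A full copy of every table at every site.
    return {t: set(sites) for t in tables}


def selectively_replicated(tables, sites, critical):
    # Critical tables are replicated (here: at all sites, an assumption);
    # noncritical tables are stored once.
    return {
        t: set(sites) if t in critical else {sites[i % len(sites)]}
        for i, t in enumerate(tables)
    }


def stored_copies(allocation):
    # Total number of table copies, a crude proxy for storage cost.
    return sum(len(holders) for holders in allocation.values())


def copies_to_update(allocation, table):
    # Every copy of a table must be written on an update.
    return len(allocation[table])


def available_tables(allocation, failed_site):
    # Tables that still have at least one copy after a single site fails.
    return {t for t, holders in allocation.items() if holders - {failed_site}}
```

For example, with `critical={"orders"}` the selective allocation stores five table copies instead of the nine needed for complete replication, and `orders` remains available when any single site fails.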
The cost/benefit of the replicated database allocation strategy can be estimated in
terms of storage cost, communication costs (query and update time), and data availability.
Figure 16.1 briefly illustrates the tradeoff by showing the data replication on the horizontal
axis and costs on the vertical axis. The following can be seen from Figure 16.1: