Both I/O time and network delay time for these data allocation methods can be
accurately estimated using the simple mathematical estimation models given in
Appendix A.
TIPS AND INSIGHTS FOR DATABASE PROFESSIONALS
Tip 1. Determine when to replicate data across a network on a simple cost-benefit
basis. Benefits occur when you can save network delay time and I/O time by
placing a copy of the data closer to the source of a query. Costs occur when the
extra copy of data must be updated every time the original data is updated.
Benefits and costs are both estimated in terms of elapsed time (network delay
time plus I/O time). In general, you want to add a copy of data to a site when
the benefit exceeds the cost.
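The comparison in Tip 1 can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's full allocation method: the names UpdateLoad, replication_benefit, replication_cost, and should_replicate are hypothetical, the workload numbers are invented, and all times are assumed to be elapsed time (network delay plus I/O) in milliseconds.

```python
from dataclasses import dataclass


@dataclass
class UpdateLoad:
    """One class of update that must also be applied to the new copy."""
    frequency: float      # updates per unit of time
    elapsed_time: float   # network delay + I/O time per update, in ms


def replication_benefit(query_frequency, remote_time, local_time):
    """Elapsed time saved per unit of time by answering queries locally."""
    return query_frequency * (remote_time - local_time)


def replication_cost(updates):
    """Elapsed time spent per unit of time keeping the extra copy current."""
    return sum(u.frequency * u.elapsed_time for u in updates)


def should_replicate(query_frequency, remote_time, local_time, updates):
    """Add the copy only when the benefit exceeds the cost."""
    benefit = replication_benefit(query_frequency, remote_time, local_time)
    return benefit > replication_cost(updates)


# Hypothetical workload: 10 queries per second from the candidate site,
# 500 ms when served remotely versus 100 ms when served locally.
updates = [UpdateLoad(frequency=2, elapsed_time=600),
           UpdateLoad(frequency=1, elapsed_time=150)]
decision = should_replicate(10, 500, 100, updates)
```

With these invented figures the benefit (4000 ms saved per second) exceeds the cost (1350 ms spent per second), so the copy would be added.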
Tip 2. When benefits and costs are approximately equal, decide whether to
replicate data based on greater availability. When you have multiple copies of
data, the availability of data is greater. This could be a major concern when remote
sites go down unexpectedly and you have high-priority queries to satisfy. Analyze
the benefits of greater availability and make your decision about data replication
based on both tangible and intangible benefits.
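One way to put a number on the availability gain is sketched below. It assumes that sites fail independently and that a query can be satisfied as long as at least one copy is reachable. Both are simplifying assumptions not stated in the text, and the function name replicated_availability is hypothetical.

```python
def replicated_availability(site_availability, copies):
    """Probability that at least one of `copies` independent sites is up."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    # All copies must be down at once for the data to be unavailable.
    return 1.0 - (1.0 - site_availability) ** copies


# Hypothetical example: each site is up 90% of the time.
single = replicated_availability(0.9, 1)
double = replicated_availability(0.9, 2)
```

Under these assumptions a second copy raises read availability from about 0.90 to about 0.99, which can be weighed against the update cost when the tangible benefits and costs are close.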
Tip 3. When the workload is very complex, use the dominant transaction
approach. One of the disadvantages of the data allocation methods presented in
this chapter is the use of averages for query times and update times. Averages do
not take into account dominant transactions whose I/O specifications are known,
in an environment where the network configuration and network protocol details
are given. Under such circumstances, the actual I/O times and network delay times
can be estimated for individual dominant transactions instead of averages across
all transactions. A dominant transaction is defined by criteria such as high
frequency of execution, high volume of data accessed, tight response time
constraints, and explicit priority.
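The criteria in Tip 3 can be turned into a simple filter over a workload description. The sketch below is illustrative only: the Transaction fields, the threshold values, and the rule that meeting any one criterion is enough are all assumptions, since the text lists the criteria without prescribing cutoffs or how to combine them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    name: str
    executions_per_hour: float
    rows_accessed: int
    response_limit_ms: Optional[float] = None  # None: no stated constraint
    high_priority: bool = False


def is_dominant(t, min_executions=1000, min_rows=100_000, tight_limit_ms=500):
    """Treat a transaction as dominant if it meets any one criterion.

    The thresholds are hypothetical and would be tuned per installation.
    """
    has_tight_limit = (t.response_limit_ms is not None
                       and t.response_limit_ms <= tight_limit_ms)
    return (t.executions_per_hour >= min_executions
            or t.rows_accessed >= min_rows
            or has_tight_limit
            or t.high_priority)


# Hypothetical workload.
workload = [
    Transaction("order_entry", 5000, 20),
    Transaction("monthly_report", 1, 2_000_000),
    Transaction("address_change", 10, 1),
    Transaction("fraud_check", 50, 5, response_limit_ms=200),
]
dominant = [t.name for t in workload if is_dominant(t)]
```

Only the transactions selected this way would then get individual I/O and network delay estimates. The rest of the workload can still be handled with averages.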
16.5 Summary
Distributed database design requires one more step of analysis than centralized database
design, but there is a set of basic principles we can use for everyday design decisions.
Replicated data allocation methods can be simply expressed and implemented to minimize the
time to execute a collection of transactions on a distributed database. The methods take
into account the execution times of remote and local database transactions for query and
update and the frequencies of these transactions. Good estimating techniques for average
I/O time and network delay costs can easily be applied to these methods.