Even if the second message is lengthy, especially where many rows satisfied the conditions,
this second query strategy is a vast improvement over the first strategy. A small number of
lengthy messages is preferable to a large number of short messages.
Systems that are record-at-a-time-oriented can create severe performance problems in
distributed systems. If the only choice is to transmit every record from one site to another site as a
message and then examine it at the other site, the communication time required can become
unacceptably high. DDBMSs that permit a request for a set of records, as opposed to an
individual record, outperform record-at-a-time systems.
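
The difference between the two strategies can be illustrated by counting messages. The following is a minimal sketch, not a real DDBMS interface: the RemoteSite class, its fetch_each and fetch_where methods, and the sample rows are all hypothetical names invented for the illustration, and each reply from the remote site is assumed to cost exactly one message.

```python
from dataclasses import dataclass


@dataclass
class RemoteSite:
    """Hypothetical remote site that counts the messages it sends back."""
    rows: list
    messages_sent: int = 0

    def fetch_each(self):
        # Record-at-a-time: every row travels in its own message and is
        # examined at the requesting site.
        for row in self.rows:
            self.messages_sent += 1
            yield row

    def fetch_where(self, predicate):
        # Set-at-a-time: the remote site filters locally and replies once,
        # however long that single reply is.
        self.messages_sent += 1
        return [row for row in self.rows if predicate(row)]


def wanted(row):
    # Illustrative condition only.
    return row["balance"] > 500


rows = [{"id": i, "balance": i * 10} for i in range(100)]

site_a = RemoteSite(rows)
matches_a = [row for row in site_a.fetch_each() if wanted(row)]

site_b = RemoteSite(rows)
matches_b = site_b.fetch_where(wanted)

print(site_a.messages_sent, site_b.messages_sent)
```

Both strategies return the same rows; only the number of messages differs, which is the point made in the text.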
More complex treatment of concurrent update. Concurrent update in a distributed database is
treated basically the same way it is treated in nondistributed databases. A user transaction
acquires locks, and the locking is two-phase. (Locks are acquired in a growing phase, during which
time no locks are released and the DDBMS applies the updates. All locks are released during the
shrinking phase.) The DDBMS detects and breaks deadlocks, and then the DDBMS rolls back
interrupted transactions. The primary distinction lies not in the kinds of activities that take place,
but in the additional level of complexity created by the very nature of a distributed database.
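
The two-phase rule described in the parenthetical can be sketched in a few lines. This is an illustrative model under simplifying assumptions (exclusive locks only, a single in-memory lock table, no waiting queue); the class TwoPhaseTransaction and its method names are hypothetical and do not come from any particular DBMS.

```python
class TwoPhaseTransaction:
    """Hypothetical transaction that enforces the two-phase locking rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False

    def lock(self, record_id, lock_table):
        # Growing phase only: once any lock has been released,
        # no further locks may be acquired.
        if self.shrinking:
            raise RuntimeError("two-phase rule: no new locks after a release")
        owner = lock_table.get(record_id)
        if owner is not None and owner != self.name:
            return False  # record is locked by another transaction
        lock_table[record_id] = self.name
        self.held.add(record_id)
        return True

    def release_all(self, lock_table):
        # Shrinking phase: all locks are released together.
        self.shrinking = True
        for record_id in self.held:
            del lock_table[record_id]
        self.held.clear()


locks = {}
t1 = TwoPhaseTransaction("T1")
t2 = TwoPhaseTransaction("T2")

granted_first = t1.lock("A", locks)   # T1 acquires the lock on record A
blocked = t2.lock("A", locks)         # T2 is refused while T1 holds it
t1.release_all(locks)                 # T1 enters its shrinking phase
granted_later = t2.lock("A", locks)   # now T2 can lock record A
```

In a distributed database the lock table would be spread across sites, so each call to lock or release_all would turn into messages between sites rather than a local dictionary update.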
If all the records to be updated by a particular transaction occur at one site, the problem is
essentially the same as in a nondistributed database. However, the records in a distributed
database might be stored at many different sites. Furthermore, if the data is replicated, each
occurrence might be stored at several sites, each requiring the same update to be performed. Assuming
each record occurrence has replicas at three different sites, an update that would affect five
record occurrences in a nondistributed system might affect 20 different record occurrences in a
distributed system (each record occurrence together with its three replica occurrences).
Having more record occurrences to update is only part of the problem. Assuming each site
keeps its own locks, the DDBMS must send many messages for each record to be updated: a request
for a lock; a message indicating that the record is already locked by another user or that the lock
has been granted; a message directing that the update be performed; an acknowledgment of the
update; and, finally, a message indicating that the record is to be unlocked. Because all those
messages must be sent for each record and its occurrences, the total time for an update can be
substantially longer in a distributed database.
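
The message traffic can be estimated with simple arithmetic. The sketch below uses the five message types listed above and the earlier example of five records with three replicas each. It assumes, for illustration only, that every occurrence is at a remote site and that every lock request is granted on the first attempt; the function names are hypothetical.

```python
# The five messages the text lists for each record occurrence.
MESSAGES_PER_OCCURRENCE = (
    "lock request",
    "lock granted or already locked",
    "perform update",
    "update acknowledged",
    "unlock",
)


def occurrences_to_update(records, replicas_per_record):
    # Each record occurrence is counted together with its replicas.
    return records * (1 + replicas_per_record)


def update_messages(records, replicas_per_record):
    occurrences = occurrences_to_update(records, replicas_per_record)
    return occurrences * len(MESSAGES_PER_OCCURRENCE)


total_occurrences = occurrences_to_update(5, 3)
total_messages = update_messages(5, 3)
print(total_occurrences, total_messages)
```

Under these assumptions, an update touching five records becomes 20 occurrences and 100 messages. Any refused lock request would add further messages.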
A partial solution to minimize the number of messages involves the use of the primary copy
mentioned earlier. Recall that one of the replicas of a given record occurrence is designated as the
primary copy. Locking the primary copy, rather than all copies, is sufficient and will reduce the
number of messages required to lock and unlock records. The number of messages might still be
large, however; and the unavailability of the primary copy can cause an entire transaction to fail.
Thus, even this partial solution presents problems.
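
A rough comparison shows where the savings come from and what the approach costs. The sketch below assumes one record with four copies (the primary plus three replicas), assumes that lock traffic goes only to the primary copy while every copy still receives the update and acknowledges it, and models an unavailable primary as an immediate failure. The function names and the message model are assumptions made for this illustration, not a description of a specific DDBMS.

```python
def messages_locking_every_copy(copies):
    # Lock request, reply, update, acknowledgment, and unlock for each copy.
    return copies * 5


def messages_locking_primary_only(copies):
    lock_traffic = 3            # lock request, reply, unlock: primary copy only
    update_traffic = copies * 2  # update plus acknowledgment for every copy
    return lock_traffic + update_traffic


def lock_primary(primary_site_up):
    # With primary-copy locking, the transaction depends on a single site.
    if not primary_site_up:
        raise RuntimeError("primary copy unavailable: transaction fails")
    return True


copies = 4
all_copies = messages_locking_every_copy(copies)
primary_only = messages_locking_primary_only(copies)
print(all_copies, primary_only)
```

Under this model the count per record drops from 20 messages to 11, which is still a substantial number, and the lock_primary function shows the single point of failure the text warns about.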
Just as in a nondistributed database, deadlock is a possibility in a distributed database. In a
distributed database, however, deadlock is more complicated because two types of deadlock, local
deadlock and global deadlock, are possible. Local deadlock is deadlock that occurs at a single site
in a distributed database. If each of two transactions is waiting for a record held by the other at
the same site, the local DBMS can detect and resolve the deadlock with a minimum number of
messages needed to communicate the situation to the other DBMSs in the distributed system.
On the other hand, global deadlock involves one transaction that requires a record held by
a second transaction at one site, while the second transaction requires a record held by the first
transaction at a different site. In this case, neither site alone has enough information to detect
the deadlock; it can be detected and resolved only by sending a large number of messages between
the DBMSs at the two sites.
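
One common way to reason about deadlock is a wait-for graph, in which an edge from one transaction to another means the first is waiting for a record the second holds. The text does not name a detection technique, so the following is a sketch of that general idea rather than the method any particular DDBMS uses. The site names, transaction names, and the functions has_cycle and merge are hypothetical.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph (txn -> set of txns) has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for blocker in wait_for.get(node, ()):
            state = color.get(blocker, WHITE)
            if state == GREY:
                return True  # reached a transaction already on this path
            if state == WHITE and visit(blocker):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(wait_for))


def merge(*graphs):
    # Combining the graphs stands in for the messages the sites must exchange.
    merged = {}
    for graph in graphs:
        for txn, blockers in graph.items():
            merged.setdefault(txn, set()).update(blockers)
    return merged


# Local deadlock: both waits occur at the same site.
site_w = {"T3": {"T4"}, "T4": {"T3"}}

# Global deadlock: T1 waits for T2 at site X, T2 waits for T1 at site Y.
site_x = {"T1": {"T2"}}
site_y = {"T2": {"T1"}}

local_deadlock = has_cycle(site_w)
seen_at_x = has_cycle(site_x)
seen_at_y = has_cycle(site_y)
global_deadlock = has_cycle(merge(site_x, site_y))
print(local_deadlock, seen_at_x, seen_at_y, global_deadlock)
```

Site W can see its own cycle without consulting anyone. Neither site X nor site Y sees a cycle in its own graph; the deadlock appears only after the two graphs are combined, which in a real system requires communication between the sites.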
The various factors involved in supporting concurrent update greatly add to the complexity
and the communications time in a distributed database.
More complex recovery measures. Although the basic recovery process for a distributed
database is the same as the one described in Chapter 7, there is an additional potential problem. To
make sure that the database remains consistent, each database update should either be made
permanent or be aborted and undone, in which case none of its changes will be made. In a distributed
database, with an individual transaction updating several local databases, it is possible—due to