Even if the second message is lengthy, especially where many rows satisfied the conditions,
this second query strategy is a vast improvement over the first strategy. A small number of
lengthy messages is preferable to a large number of short messages.
Systems that are record-at-a-time-oriented can create severe performance problems in
distributed systems. If the only choice is to transmit every record from one site to another site
as a message and then examine it at the other site, the communication time required can
become unacceptably high. DDBMSs that permit a request for a set of records, as opposed to an
individual record, outperform record-at-a-time systems.
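
The difference in message traffic can be illustrated with a minimal sketch. It assumes a hypothetical remote site holding 1,000 records and counts only messages, not their length; the names Record, record_at_a_time, and set_at_a_time are invented for illustration and do not come from any particular DDBMS.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: int
    balance: float


def record_at_a_time(remote_records, predicate):
    """Ship every record to the requesting site, then examine it there."""
    messages = 0
    matches = []
    for record in remote_records:
        messages += 1              # one message per record transmitted
        if predicate(record):      # examined only after it arrives
            matches.append(record)
    return matches, messages


def set_at_a_time(remote_records, predicate):
    """Send the request once; the remote site filters and replies once."""
    messages = 1                                           # the request
    matches = [r for r in remote_records if predicate(r)]  # evaluated remotely
    messages += 1                                          # one (possibly lengthy) reply
    return matches, messages


# Hypothetical data: 1,000 records, of which only a few satisfy the condition.
records = [Record(k, float(k * 10)) for k in range(1, 1001)]
wanted = lambda r: r.balance > 9900.0

slow_matches, slow_msgs = record_at_a_time(records, wanted)
fast_matches, fast_msgs = set_at_a_time(records, wanted)
print(slow_msgs, fast_msgs, len(fast_matches))
```

Both strategies return the same rows; only the number of messages differs, which is the cost the text is concerned with.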
• More complex treatment of concurrent update. Concurrent update in a distributed database is
treated basically the same way it is treated in nondistributed databases. A user transaction
acquires locks, and the locking is two-phase. (Locks are acquired in a growing phase, during
which time no locks are released and the DDBMS applies the updates. All locks are released
during the shrinking phase.) The DDBMS detects and breaks deadlocks, and then the DDBMS
rolls back interrupted transactions. The primary distinction lies not in the kinds of activities that
take place, but in the additional level of complexity created by the very nature of a distributed
database.
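
The two-phase rule described above can be sketched in a few lines. This is an illustration only, assuming a single in-memory lock table (a plain dictionary) and exclusive locks; the class name TwoPhaseTransaction and the record identifier are hypothetical.

```python
class TwoPhaseTransaction:
    """A transaction that may not acquire locks once it has released any."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # becomes True when the shrinking phase starts

    def lock(self, lock_table, record_id):
        if self.shrinking:
            raise RuntimeError("two-phase rule violated: lock requested after release")
        owner = lock_table.get(record_id)
        if owner is not None and owner != self.name:
            return False         # held by another transaction; caller must wait
        lock_table[record_id] = self.name
        self.held.add(record_id)
        return True

    def release_all(self, lock_table):
        """Enter the shrinking phase and release every lock held."""
        self.shrinking = True
        for record_id in self.held:
            del lock_table[record_id]
        self.held.clear()


lock_table = {}
t1 = TwoPhaseTransaction("T1")
t2 = TwoPhaseTransaction("T2")

first = t1.lock(lock_table, "customer-148")    # growing phase: granted
blocked = t2.lock(lock_table, "customer-148")  # refused while T1 holds the lock
t1.release_all(lock_table)                     # shrinking phase for T1
retry = t2.lock(lock_table, "customer-148")    # now granted to T2
```

In a distributed database the same rule applies, but the lock table is spread across sites, which is where the added complexity discussed next comes from.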
If all the records to be updated by a particular transaction occur at one site, the problem is
essentially the same as in a nondistributed database. However, the records in a distributed
database might be stored at many different sites. Furthermore, if the data is replicated, each
occurrence might be stored at several sites, each requiring the same update to be performed.
Assuming each record occurrence has replicas at three different sites, an update that would
affect 5 record occurrences in a nondistributed system might affect 20 different record
occurrences in a distributed system (each record occurrence together with its three replica
occurrences).
Having more record occurrences to update is only part of the problem. Assuming each site
keeps its own locks, the DDBMS must send many messages for each record to be updated: a
request for a lock, a message indicating that the record is already locked by another user or that
the lock has been granted, a message directing that the update be performed, an acknowledgment
of the update, and, finally, a message indicating that the record is to be unlocked. Because
all those messages must be sent for each record and its occurrences, the total time for an update
can be substantially longer in a distributed database.
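
As a rough worked example, the message count can be tallied as follows. The sketch assumes that every copy is at a remote site, that every lock is granted on the first request, and that exactly the five messages listed above are exchanged per copy; the function names are hypothetical.

```python
# The five messages exchanged for each copy of a record, as listed in the text.
MESSAGES_PER_COPY = (
    "lock request",
    "lock granted or already locked",
    "perform update",
    "update acknowledgment",
    "unlock",
)


def copies_to_update(records, replicas_per_record):
    """Each record occurrence plus its replica occurrences."""
    return records * (1 + replicas_per_record)


def messages_for_update(records, replicas_per_record):
    """Total messages when every copy is locked, updated, and unlocked."""
    return copies_to_update(records, replicas_per_record) * len(MESSAGES_PER_COPY)


copies = copies_to_update(5, 3)     # 5 records, 3 replicas each
total = messages_for_update(5, 3)
print(copies, total)
```

Under these assumptions, the 5-record update touches 20 copies and requires 100 messages; a lock that is refused and retried would add more.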
A partial solution that reduces the number of messages involves the use of the primary copy
mentioned earlier. Recall that one of the replicas of a given record occurrence is designated as
the primary copy. Locking the primary copy, rather than all copies, is sufficient and will reduce
the number of messages required to lock and unlock records. The number of messages might still
be large, however, and the unavailability of the primary copy can cause an entire transaction to
fail. Thus, even this partial solution presents problems.
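
The trade-off can be sketched as follows. This is a simplified model, assuming that lock traffic goes only to the site holding the primary copy and that an unreachable primary site makes the lock request fail outright; the class PrimaryCopyLocks, the exception SiteUnavailable, and the site names are hypothetical.

```python
class SiteUnavailable(Exception):
    """Raised when the site holding a record's primary copy cannot be reached."""


class PrimaryCopyLocks:
    def __init__(self, primary_site_of, available_sites):
        self.primary_site_of = primary_site_of       # record id -> site name
        self.available_sites = set(available_sites)
        self.locks = {}                              # record id -> transaction
        self.messages = 0                            # lock-related messages sent

    def lock(self, record_id, txn):
        site = self.primary_site_of[record_id]
        if site not in self.available_sites:
            raise SiteUnavailable(f"primary copy of {record_id} at {site} is unreachable")
        self.messages += 2                           # request + reply, primary site only
        if self.locks.get(record_id, txn) != txn:
            return False                             # already locked by another transaction
        self.locks[record_id] = txn
        return True

    def unlock(self, record_id, txn):
        if self.locks.get(record_id) == txn:
            del self.locks[record_id]
            self.messages += 1                       # unlock, primary site only


locks = PrimaryCopyLocks({"R1": "site_a", "R2": "site_b"}, available_sites={"site_a"})
granted = locks.lock("R1", "T1")   # primary copy reachable: lock granted
locks.unlock("R1", "T1")
try:
    locks.lock("R2", "T1")         # primary copy at site_b is down
    failed = False
except SiteUnavailable:
    failed = True                  # the transaction cannot proceed
```

Locking and unlocking R1 costs three messages no matter how many replicas exist, but the request for R2 fails even if every replica of R2 is reachable.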
Just as in a nondistributed database, deadlock is a possibility in a distributed database. In a
distributed database, however, deadlock is more complicated because two types of deadlock, local
deadlock and global deadlock, are possible. Local deadlock is deadlock that occurs at a single site
in a distributed database. If each of two transactions is waiting for a record held by the other at
the same site, the local DBMS can detect and resolve the deadlock, and only a minimal number of
messages is needed to communicate the situation to the other DBMSs in the distributed system.
On the other hand, global deadlock involves one transaction that requires a record held by
a second transaction at one site, while the second transaction requires a record held by the first
transaction at a different site. In this case, neither site individually has enough information
to detect the deadlock; it can be detected and resolved only by
sending a large number of messages between the DBMSs at the two sites.
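
One common way to reason about deadlock is a wait-for graph, in which an edge from T1 to T2 means that T1 is waiting for a record held by T2; a cycle in the graph indicates deadlock. The sketch below is an illustration under that assumption and is not taken from the text: each site holds only its own edges, so neither site sees a cycle until the graphs are combined. The helper names has_cycle and merge are hypothetical.

```python
def has_cycle(wait_for):
    """Detect a cycle in a wait-for graph given as {txn: set of txns it waits for}."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True                      # reached a transaction already on the path
        visiting.add(txn)
        found = any(visit(nxt) for nxt in wait_for.get(txn, ()))
        visiting.discard(txn)
        done.add(txn)
        return found

    return any(visit(txn) for txn in list(wait_for))


def merge(*graphs):
    """Combine per-site wait-for graphs into one global graph."""
    merged = {}
    for graph in graphs:
        for txn, waits in graph.items():
            merged.setdefault(txn, set()).update(waits)
    return merged


# Site A: T1 waits for a record held by T2.
# Site B: T2 waits for a record held by T1.
site_a = {"T1": {"T2"}}
site_b = {"T2": {"T1"}}

local_at_a = has_cycle(site_a)                      # no cycle visible at site A
local_at_b = has_cycle(site_b)                      # no cycle visible at site B
global_deadlock = has_cycle(merge(site_a, site_b))  # cycle appears once combined

# Local deadlock: both waits occur at the same site, so that site sees the cycle.
local_deadlock = has_cycle({"T1": {"T2"}, "T2": {"T1"}})
```

Building the combined graph requires the sites to exchange their wait-for information, which is the source of the extra messages.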
The various factors involved in supporting concurrent update greatly add to the complexity
and the communications time in a distributed database.
• More complex recovery measures. Although the basic recovery process for a distributed database
is the same as the one described in Chapter 7, there is an additional potential problem. To
make sure that the database remains consistent, each database update should either be made
permanent or be aborted and undone, in which case none of its changes will be made. In a distributed
database, with an individual transaction updating several local databases, it is possible
because