DATABASE MANAGEMENT APPROACHES - Concepts of Database Management

Even if the second message is lengthy, especially where many rows satisfy the conditions, this second query strategy is a vast improvement over the first strategy. A small number of lengthy messages is preferable to a large number of short messages.

Systems that are record-at-a-time oriented can create severe performance problems in distributed systems. If the only choice is to transmit every record from one site to another site as a message and then examine it at the other site, the communication time required can become unacceptably high. DDBMSs that permit a request for a set of records, as opposed to an individual record, outperform record-at-a-time systems.
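
The difference between the two approaches can be illustrated by counting messages. The following is a minimal sketch, not a real DDBMS interface: the RemoteSite class, its methods, and the rule that every request and every reply counts as one message are assumptions made only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class RemoteSite:
    """Hypothetical remote site that counts the messages it exchanges."""
    rows: list
    messages: int = 0

    def row_count(self):
        # One request and one reply.
        self.messages += 2
        return len(self.rows)

    def fetch_row(self, index):
        # One request and one reply for every single row.
        self.messages += 2
        return self.rows[index]

    def fetch_matching(self, predicate):
        # One request carrying the condition, one reply carrying all matches.
        self.messages += 2
        return [row for row in self.rows if predicate(row)]


def record_at_a_time(site, predicate):
    """Ship every row to the requesting site and examine it there."""
    matches = []
    for i in range(site.row_count()):
        row = site.fetch_row(i)
        if predicate(row):
            matches.append(row)
    return matches


def set_at_a_time(site, predicate):
    """Ask the remote site to examine the rows and return only the matching set."""
    return site.fetch_matching(predicate)


rows = [{"part": n, "price": n * 10} for n in range(1, 1001)]


def is_expensive(row):
    return row["price"] > 9900


site_a = RemoteSite(rows)
site_b = RemoteSite(rows)
result_a = record_at_a_time(site_a, is_expensive)
result_b = set_at_a_time(site_b, is_expensive)
print(site_a.messages, site_b.messages)  # 2002 2
```

Both strategies return the same 10 rows, but under these assumptions the record-at-a-time strategy exchanges 2,002 messages while the set-oriented strategy exchanges 2.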

● More complex treatment of concurrent update. Concurrent update in a distributed database is treated basically the same way it is treated in nondistributed databases. A user transaction acquires locks, and the locking is two-phase. (Locks are acquired in a growing phase, during which time no locks are released and the DDBMS applies the updates. All locks are released during the shrinking phase.) The DDBMS detects and breaks deadlocks, and then the DDBMS rolls back interrupted transactions. The primary distinction lies not in the kinds of activities that take place, but in the additional level of complexity created by the very nature of a distributed database.
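
The two-phase rule described above can be sketched in a few lines. This is a simplified, single-site illustration under stated assumptions: the TwoPhaseTransaction class, the shared lock table, and the choice to return False when a record is already locked are all hypothetical and do not represent how any particular DDBMS implements locking.

```python
class TwoPhaseTransaction:
    """Toy transaction that enforces the two-phase locking rule."""

    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table  # shared dict: record id -> owning transaction
        self.held = set()
        self.shrinking = False

    def lock(self, record_id):
        # Growing phase: locks may be acquired only while none have been released.
        if self.shrinking:
            raise RuntimeError("cannot acquire a lock after locks were released")
        owner = self.lock_table.get(record_id)
        if owner is not None and owner != self.name:
            return False  # record is already locked by another transaction
        self.lock_table[record_id] = self.name
        self.held.add(record_id)
        return True

    def update(self, database, record_id, value):
        # Updates are applied while the locks are still held.
        if record_id not in self.held:
            raise RuntimeError("record must be locked before it is updated")
        database[record_id] = value

    def release_all(self):
        # Shrinking phase: every lock is released, and no new lock may follow.
        self.shrinking = True
        for record_id in self.held:
            del self.lock_table[record_id]
        self.held.clear()


lock_table = {}
database = {"r1": 0, "r2": 0}
t1 = TwoPhaseTransaction("T1", lock_table)
t2 = TwoPhaseTransaction("T2", lock_table)

t1.lock("r1")
t1.lock("r2")
blocked = t2.lock("r1")  # False: T1 still holds the lock on r1
t1.update(database, "r1", 5)
t1.release_all()
```

After release_all, any further call to t1.lock raises an error, which is the sketch's way of showing that a transaction in its shrinking phase cannot return to the growing phase.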

If all the records to be updated by a particular transaction occur at one site, the problem is essentially the same as in a nondistributed database. However, the records in a distributed database might be stored at many different sites. Furthermore, if the data is replicated, each occurrence might be stored at several sites, each requiring the same update to be performed. Assuming each record occurrence has replicas at three other sites, an update that would affect five record occurrences in a nondistributed system might affect 20 different record occurrences in a distributed system (each of the five record occurrences together with its three replica occurrences).

Having more record occurrences to update is only part of the problem. Assuming each site keeps its own locks, the DDBMS must send many messages for each record to be updated: a request for a lock; a message indicating that the record is already locked by another user or that the lock has been granted; a message directing that the update be performed; an acknowledgment of the update; and, finally, a message indicating that the record is to be unlocked. Because all those messages must be sent for each record and each of its replicas, the total time for an update can be substantially longer in a distributed database.
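
To get a rough sense of the scale, the messages listed above can be counted for an update of five records, each stored as four copies (the record plus three replicas). The sketch below is only an estimate under simplifying assumptions: every lock is granted on the first request, every copy is at a remote site, and the function name update_messages is hypothetical.

```python
# The five messages exchanged for each copy that is updated.
MESSAGES_PER_COPY = (
    "lock request",
    "lock granted or already locked",
    "perform update",
    "update acknowledged",
    "unlock",
)


def update_messages(records, copies_per_record):
    """Return (record occurrences touched, messages sent) for one update."""
    occurrences = records * copies_per_record
    return occurrences, occurrences * len(MESSAGES_PER_COPY)


occurrences, messages = update_messages(records=5, copies_per_record=4)
print(occurrences, messages)  # 20 100
```

Under these assumptions, an update of five records touches 20 record occurrences and requires 100 messages. A lock request that is refused and must be repeated would raise the count further.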

A partial solution that reduces the number of messages involves the use of the primary copy mentioned earlier. Recall that one of the replicas of a given record occurrence is designated as the primary copy. Locking the primary copy, rather than all copies, is sufficient and will reduce the number of messages required to lock and unlock records. The number of messages might still be large, however, and the unavailability of the primary copy can cause an entire transaction to fail. Thus, even this partial solution presents problems.
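
The saving, and the new point of failure, can both be illustrated with a rough count. This sketch assumes five messages per copy when every copy is locked (lock request, reply, update, acknowledgment, unlock), assumes that with a primary copy only the three locking messages go to the primary site while every copy still receives the update and acknowledges it, and uses hypothetical function names and site names throughout.

```python
def messages_locking_every_copy(records, copies):
    # Per copy: lock request, lock reply, update, acknowledgment, unlock.
    return records * copies * 5


def messages_locking_primary_copy(records, copies):
    # Per record: lock request, lock reply, and unlock go to the primary copy only.
    lock_traffic = records * 3
    # Per copy: the update and its acknowledgment are still required.
    update_traffic = records * copies * 2
    return lock_traffic + update_traffic


def can_run(primary_sites, available_sites):
    """A transaction can proceed only if every primary copy it needs is reachable."""
    return all(site in available_sites for site in primary_sites)


every_copy = messages_locking_every_copy(records=5, copies=4)
primary_only = messages_locking_primary_copy(records=5, copies=4)
print(every_copy, primary_only)  # 100 55

# Site B holds a primary copy but is down, so the whole transaction cannot run.
runnable = can_run(primary_sites=["A", "B"], available_sites={"A", "C"})
print(runnable)  # False
```

The count drops from 100 to 55 under these assumptions, which is an improvement but still a large number of messages, and a single unreachable primary site is enough to stop the transaction.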

Just as in a nondistributed database, deadlock is a possibility in a distributed database. In a distributed database, however, deadlock is more complicated because two types of deadlock, local deadlock and global deadlock, are possible. Local deadlock is deadlock that occurs at a single site in a distributed database. If each of two transactions is waiting for a record held by the other at the same site, the local DBMS can detect and resolve the deadlock with a minimum number of messages needed to communicate the situation to the other DBMSs in the distributed system.

On the other hand, global deadlock involves one transaction that requires a record held by a second transaction at one site, while the second transaction requires a record held by the first transaction at a different site. In this case, neither site individually has enough information to detect the deadlock; this is a global deadlock, and it can be detected and resolved only by sending a large number of messages between the DBMSs at the two sites.
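
One common way to reason about this situation is a wait-for graph, in which an edge from one transaction to another means the first is waiting for a record held by the second, and a cycle means deadlock. The text does not say which detection technique a DDBMS uses, so the following is only an illustrative sketch; the function names and the dictionary representation of each site's graph are assumptions.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph (dict: txn -> set of txns) contains a cycle."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(txn):
        state[txn] = IN_PROGRESS
        for blocker in wait_for.get(txn, ()):
            blocker_state = state.get(blocker, UNSEEN)
            if blocker_state == IN_PROGRESS:
                return True  # reached a transaction already on the current path
            if blocker_state == UNSEEN and visit(blocker):
                return True
        state[txn] = DONE
        return False

    return any(state.get(txn, UNSEEN) == UNSEEN and visit(txn) for txn in list(wait_for))


def merge(*graphs):
    """Combine the wait-for graphs reported by several sites into one graph."""
    merged = {}
    for graph in graphs:
        for txn, blockers in graph.items():
            merged.setdefault(txn, set()).update(blockers)
    return merged


# Site 1: T1 waits for a record held by T2. Site 2: T2 waits for a record held by T1.
site_1 = {"T1": {"T2"}}
site_2 = {"T2": {"T1"}}

local_1 = has_cycle(site_1)
local_2 = has_cycle(site_2)
global_view = has_cycle(merge(site_1, site_2))
print(local_1, local_2, global_view)  # False False True
```

Neither site sees a cycle in its own graph; the cycle appears only after the two graphs are combined, and combining them is what requires the sites to exchange messages.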

The various factors involved in supporting concurrent update greatly add to the complexity and the communications time in a distributed database.

● More complex recovery measures. Although the basic recovery process for a distributed database is the same as the one described in Chapter 7, there is an additional potential problem. To make sure that the database remains consistent, each database update must either be made permanent or be aborted and undone, in which case none of its changes will remain. In a distributed database, with an individual transaction updating several local databases, it is possible, due to
