DATABASE MANAGEMENT APPROACHES - Concepts of Database Management

Even if the second message is lengthy, especially where many rows satisfy the conditions,

this second query strategy is a vast improvement over the first strategy. A small number of

lengthy messages is preferable to a large number of short messages.

Systems that are record-at-a-time-oriented can create severe performance problems in

distributed systems. If the only choice is to transmit every record from one site to another site

as a message and then examine it at the other site, the communication time required can

become unacceptably high. DDBMSs that permit a request for a set of records, as opposed to an

individual record, outperform record-at-a-time systems.
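
The difference can be illustrated with a rough message count. This is only a sketch under stated assumptions, not a measurement: it assumes that a record-at-a-time system ships each remote record as its own message, and that a set-oriented request costs one request message plus one (possibly lengthy) reply. The function names are hypothetical.

```python
def messages_record_at_a_time(remote_records):
    """Assumed model: every record at the remote site is shipped as its own
    message so that it can be examined at the requesting site."""
    return remote_records


def messages_set_at_a_time():
    """Assumed model: one message carries the request and its conditions,
    and one reply carries every record that satisfies them."""
    return 2


# Hypothetical remote table of 10,000 records.
per_record = messages_record_at_a_time(10_000)
per_set = messages_set_at_a_time()
print(per_record, per_set)
```

Under these assumptions the record-at-a-time cost grows with the size of the remote table, while the set-oriented cost stays constant no matter how many records are examined.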

More complex treatment of concurrent update. Concurrent update in a distributed database is

treated basically the same way it is treated in nondistributed databases. A user transaction

acquires locks, and the locking is two-phase. (Locks are acquired in a growing phase, during

which time no locks are released and the DDBMS applies the updates. All locks are released

during the shrinking phase.) The DDBMS detects and breaks deadlocks, and then the DDBMS

rolls back interrupted transactions. The primary distinction lies not in the kinds of activities that

take place, but in the additional level of complexity created by the very nature of a distributed

database.
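
The two-phase rule described above can be sketched as a small class. This is a minimal illustration, not the locking code of any particular DDBMS; the class and method names are hypothetical, and lock conflicts between transactions are deliberately left out.

```python
class TwoPhaseTransaction:
    """Tracks the locks held by one transaction and enforces the
    two-phase rule: no lock may be acquired once locks are released."""

    def __init__(self, name):
        self.name = name
        self.held = set()       # record ids currently locked
        self.shrinking = False  # becomes True when the release begins

    def lock(self, record_id):
        # Growing phase: locks may only be added.
        if self.shrinking:
            raise RuntimeError("two-phase rule violated: lock after release")
        self.held.add(record_id)

    def release_all(self):
        # Shrinking phase: all locks are released together.
        self.shrinking = True
        self.held.clear()


txn = TwoPhaseTransaction("T1")
txn.lock("record-1")
txn.lock("record-2")   # still in the growing phase
txn.release_all()      # shrinking phase; further lock() calls raise
```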

If all the records to be updated by a particular transaction occur at one site, the problem is

essentially the same as in a nondistributed database. However, the records in a distributed

database might be stored at many different sites. Furthermore, if the data is replicated, each

occurrence might be stored at several sites, each requiring the same update to be performed.

Assuming each record occurrence has replicas at three different sites, an update that would

affect 5 record occurrences in a nondistributed system might affect 20 different record

occurrences in a distributed system (each record occurrence together with its three replica

occurrences).

Having more record occurrences to update is only part of the problem. Assuming each site

keeps its own locks, the DDBMS must send many messages for each record to be updated: a

request for a lock, a message indicating that the record is already locked by another user or that

the lock has been granted, a message directing that the update be performed, an

acknowledgment of the update, and, finally, a message indicating that the record is to be unlocked. Because

all those messages must be sent for each record and its occurrences, the total time for an update

can be substantially longer in a distributed database.
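
To make the arithmetic concrete, the sketch below counts occurrences and messages for the example in the text. It assumes, purely for illustration, that every occurrence is at a remote site, that each lock is granted on the first request, and that exactly the five messages listed above are exchanged per occurrence. The names are hypothetical.

```python
# The five messages named in the text, exchanged for each record occurrence.
MESSAGES_PER_OCCURRENCE = (
    "lock request",
    "lock granted or already locked",
    "perform update",
    "update acknowledgment",
    "unlock",
)


def occurrences_to_update(records, replicas_per_record):
    """Each record counts once, plus once for each of its replicas."""
    return records * (1 + replicas_per_record)


def update_messages(records, replicas_per_record):
    """Assumed model: every occurrence needs the full message sequence."""
    occurrences = occurrences_to_update(records, replicas_per_record)
    return occurrences * len(MESSAGES_PER_OCCURRENCE)


print(occurrences_to_update(5, 3))  # 5 records, 3 replicas each
print(update_messages(5, 3))
```

With 5 records and three replicas each, this model gives 20 occurrences and 100 messages, compared with no inter-site messages at all when the same update runs in a nondistributed database.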

A partial solution to minimize the number of messages involves the use of the primary copy

mentioned earlier. Recall that one of the replicas of a given record occurrence is designated as

the primary copy. Locking the primary copy, rather than all copies, is sufficient and will reduce

the number of messages required to lock and unlock records. The number of messages might still

be large, however, and the unavailability of the primary copy can cause an entire transaction to

fail. Thus, even this partial solution presents problems.
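
The saving from primary-copy locking can be estimated with the same kind of count. The model below is an assumption made for illustration: lock, lock-reply, and unlock messages go only to the primary copy, while the update and its acknowledgment still go to every occurrence. The function names are hypothetical.

```python
def all_copies_messages(records, replicas_per_record):
    """Assumed model: lock request, lock reply, update, acknowledgment,
    and unlock (5 messages) for every occurrence."""
    occurrences = records * (1 + replicas_per_record)
    return 5 * occurrences


def primary_copy_messages(records, replicas_per_record):
    """Assumed model: lock request, lock reply, and unlock (3 messages) go
    to the primary copy only; update and acknowledgment (2 messages) still
    go to every occurrence."""
    occurrences = records * (1 + replicas_per_record)
    return 3 * records + 2 * occurrences


print(all_copies_messages(5, 3))
print(primary_copy_messages(5, 3))
```

For 5 records with three replicas each, this model gives 55 messages instead of 100. The count is smaller but still grows with the number of replicas, and the model says nothing about the case where the primary copy is unavailable.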

Just as in a nondistributed database, deadlock is a possibility in a distributed database. In a

distributed database, however, deadlock is more complicated because two types of deadlock, local

deadlock and global deadlock, are possible. Local deadlock is deadlock that occurs at a single site

in a distributed database. If each of two transactions is waiting for a record held by the other at

the same site, the local DBMS can detect and resolve the deadlock on its own, and only a minimal

number of messages is needed to communicate the situation to the other DBMSs in the distributed system.

On the other hand, global deadlock involves one transaction that requires a record held by

a second transaction at one site, while the second transaction requires a record held by the first

transaction at a different site. In this case, neither site individually has enough information to

detect the deadlock; it can be detected and resolved only by

sending a large number of messages between the DBMSs at the two sites.
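
One common way to reason about this is a wait-for graph, in which an edge from one transaction to another means the first is waiting for a record the second holds. The sketch below is an illustration under that assumption, not the detection method of any specific DDBMS; the transaction names, site names, and helper functions are hypothetical. It shows that neither site's graph contains a cycle, while the combined graph does.

```python
def has_cycle(wait_for):
    """Depth-first search for a cycle in a wait-for graph given as
    {transaction: set of transactions it is waiting for}."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True  # reached a transaction already on the current path
        visiting.add(txn)
        for blocker in wait_for.get(txn, ()):
            if visit(blocker):
                return True
        visiting.remove(txn)
        done.add(txn)
        return False

    return any(visit(txn) for txn in list(wait_for))


def merge_graphs(*graphs):
    """Combine the wait-for edges reported by several sites."""
    merged = {}
    for graph in graphs:
        for txn, blockers in graph.items():
            merged.setdefault(txn, set()).update(blockers)
    return merged


site_a = {"T1": {"T2"}}  # at site A, T1 waits for a record held by T2
site_b = {"T2": {"T1"}}  # at site B, T2 waits for a record held by T1

print(has_cycle(site_a), has_cycle(site_b))     # each site alone
print(has_cycle(merge_graphs(site_a, site_b)))  # combined view
```

Building the combined graph is what requires the inter-site messages: each site must send its local waiting information to wherever the check is performed.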

The various factors involved in supporting concurrent update greatly add to the complexity

and the communications time in a distributed database.

More complex recovery measures. Although the basic recovery process for a distributed

database is the same as the one described in Chapter 7, there is an additional potential problem. To

make sure that the database remains consistent, each database update should be either made

permanent or aborted and undone, in which case none of its changes will be made. In a distributed

database, with an individual transaction updating several local databases, it is possible

—

because
