But if the nature of the data and of the applications that use it require all of
the data in the replicated tables worldwide always to be consistent, accurate, and
up-to-date, then a more complex "synchronous" procedure must be put in place.
While there are variations on this theme, the basic process for accomplishing this is
known as the "two-phase commit." The two-phase commit works like this. Each
computer on the network has a special log file in addition to its database tables.
So, in Figure 12.9, each of the five cities has one of these special log files. Now,
when an update is to be made at one site, the distributed DBMS has to do several
things. It has to freeze all the replicated copies of the table involved, send the update
out to all the sites with the table copies, and then be sure that all the copies were
updated. After all of that happens, all of the replicated copies of the table will have
been updated and processing can resume. Remember that, for this to work properly,
either all of the replicated copies must be updated or none of them must be.
What we don't want is for the update to take place at some of the sites and not at
the others, since this would obviously leave the copies inconsistent.
Let's look at an example using Table D in Figure 12.9. Copies of Table D are
located in Los Angeles, Memphis, and Paris. Say that someone issues an update
request to a record in Table D in Memphis. In the first, or "prepare," phase of the
two-phase commit, the computer in Memphis sends the updated data to Los Angeles
and Paris. The computers in all three cities write the update to their logs (but not to
their actual copies of Table D at this point). The computers in Los Angeles and Paris
attempt to lock their copies of Table D to get ready for the update. If another process
is using their copy of Table D then they will not be able to do this. Los Angeles and
Paris then report back to Memphis whether or not they are in good operating shape
and whether or not they were able to lock Table D. The computer in Memphis takes
in all of this information and then decides whether to continue with the update or
to abort it. If Los Angeles and Paris report back that they are up and running and
were able to lock Table D, then the computer in Memphis will decide to go ahead
with the update. If the news from Los Angeles and Paris was bad, Memphis will
decide not to go ahead with the update. So, in the second, or "commit," phase of
the two-phase commit, Memphis sends its decision to Los Angeles and Paris. If it
decides to complete the update, then all three cities transfer the updated data from
their logs to their copy of Table D. If it decides to abort the update, then none of
the sites transfer the updated data from their logs to their copy of Table D. All three
copies of Table D remain as they were and Memphis can start the process all over
again.
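The two phases described above can be sketched in a few lines of code. This is a minimal simulation, not a real DBMS protocol implementation: the Site class, its method names, and the update format are all assumptions made for illustration, and the "another process is using the table" situation is simulated with a simple flag.

```python
class Site:
    """One computer in the network, holding a replicated copy of
    Table D plus the special log file the text describes."""

    def __init__(self, name):
        self.name = name
        self.table_d = {}      # the actual replicated copy of Table D
        self.log = None        # the pending update, written during "prepare"
        self.locked = False    # True if some process holds Table D

    def prepare(self, update):
        """Phase 1: write the update to the log and try to lock Table D.
        Returns the report sent back to the coordinator."""
        if self.locked:
            return False       # another process is using Table D
        self.log = update
        self.locked = True
        return True

    def commit(self):
        """Phase 2, success: transfer the update from the log to Table D."""
        key, value = self.log
        self.table_d[key] = value
        self.log = None
        self.locked = False

    def abort(self):
        """Phase 2, failure: discard the log; Table D stays as it was."""
        if self.log is not None:   # only sites that prepared release the lock
            self.log = None
            self.locked = False


def two_phase_commit(coordinator, participants, update):
    """Run the protocol from the coordinator (Memphis in the example)."""
    sites = [coordinator] + participants
    # Phase 1: every site prepares and reports back whether it succeeded.
    reports = [site.prepare(update) for site in sites]
    ready = all(reports)
    # Phase 2: commit everywhere or abort everywhere -- never a mix.
    for site in sites:
        site.commit() if ready else site.abort()
    return ready
```

Using the Figure 12.9 example, `two_phase_commit(memphis, [la, paris], update)` returns True and updates all three copies only when every site reports a successful prepare; if any site cannot lock its copy, all three logs are discarded and no copy of Table D changes.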
The two-phase commit is certainly a complex, costly, and time-consuming
process. It should be clear that the more volatile the data in the database is, the less
attractive this type of synchronous procedure becomes for updating replicated
tables in the distributed database.
Distributed Joins
Let's take a look at the issue of distributed joins, which came up earlier. In a
distributed database in which no single computer (no single city) in the network
contains the entire database, there is the possibility that a query will be run from
one computer requiring a join of two or more tables that are not all at the same
computer. Consider the distributed database design in Figure 12.9. Let's say that a
query is issued at Los Angeles that requires the join of Tables E and F. First of all,
neither of the two tables is located at Los Angeles, the site that issued the query.
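The situation just described can be made concrete with a small sketch. The table contents and keys below are invented for illustration, and the naive plan shown here, shipping complete copies of both tables to the query site, is only one possible strategy a distributed DBMS might consider.

```python
# Tables E and F each live only at remote sites, not at Los Angeles,
# where the query is issued. The data and join key are hypothetical.
TABLE_E_REMOTE = {101: "Accounting", 102: "Engineering"}   # at one remote site
TABLE_F_REMOTE = {101: "Smith", 103: "Jones"}              # at another remote site

def fetch(remote_table):
    """Stand-in for transmitting an entire table copy over the network."""
    return dict(remote_table)

def join_at_query_site():
    """Naive plan: ship both tables to Los Angeles, then join locally
    on the shared key."""
    e = fetch(TABLE_E_REMOTE)   # network cost: all of Table E
    f = fetch(TABLE_F_REMOTE)   # network cost: all of Table F
    return {k: (e[k], f[k]) for k in e.keys() & f.keys()}
```

Even in this tiny example, the query site pays to transmit every row of both tables just to produce the one matching row, which is why choosing where to perform a distributed join matters.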