Referring to Fig. 6.2 again, we can't delete the fact that Jones is studying Physics without losing the information that Professor
Brown teaches Physics.
Now, we can get over these problems by decomposing the relvar appropriately. Applying Heath's Theorem
to the FD {T} → {J} (take X , Y , and Z to be {T}, {J}, and {S}, respectively), we obtain the following nonloss
decomposition:
TJ { T , J }
KEY { T }
TS { T , S }
KEY { T , S }
I'll leave it as an exercise to show the values of these two relvars corresponding to the value of SJT shown in
Fig. 6.2, to show they're in BCNF, and to check that the decomposition does in fact avoid the redundancy and
update anomalies mentioned above. Observe in particular that the FD {T} → {J} becomes a key constraint in this
decomposition; in the original design, by contrast, it had to be stated and enforced separately.
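By way of a sketch of that exercise, the following Python fragment (Python rather than Tutorial D, purely for illustration) models relations as sets of tuples. Note that only the two SJT tuples actually mentioned in the running text are included here, not the full value of Fig. 6.2:

# Relations modeled as sets of tuples; attribute order here is (S, J, T).
# Only the two tuples mentioned in the text are shown.
SJT = {
    ("Smith", "Physics", "Prof. Green"),
    ("Jones", "Physics", "Prof. Brown"),
}

# The two projections of the decomposition, TJ{T,J} and TS{T,S}.
TJ = {(t, j) for (s, j, t) in SJT}
TS = {(t, s) for (s, j, t) in SJT}

# {T} is a key of TJ: no teacher appears with two different subjects.
assert len({t for (t, j) in TJ}) == len(TJ)

# Nonloss check: joining TJ and TS back over T reproduces SJT exactly,
# as Heath's Theorem guarantees it must.
rejoined = {(s, j, t) for (t, j) in TJ for (t2, s) in TS if t2 == t}
assert rejoined == SJT

With the full Fig. 6.2 value the same checks go through, since the join is over the common attribute T and the FD {T} → {J} holds.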
There's another problem, though. The fact is, although the decomposition into TJ and TS does avoid certain
anomalies, it unfortunately introduces others. To be specific, the FD
{ S , J } → { T }
is lost (certainly it isn't implied by the FD {T} → {J}, which is the only nontrivial FD to hold in the result of the
decomposition). As a consequence, relvars TJ and TS can't be independently updated. For example, an attempt to
insert the tuple
( Smith , Prof. Brown )
into TS must be rejected, because Professor Brown teaches Physics and Smith is already being taught Physics by
Professor Green; yet this fact can't be detected without inspecting TJ.
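Continuing the same illustrative Python sketch (the function name can_insert_ts is hypothetical, not part of any DBMS), the check that rejects that insertion has to read TJ as well as TS, which is exactly what makes the constraint a multirelvar one:

# The "lost" FD {S,J} -> {T} now spans both relvars: to decide whether a
# new (teacher, student) pair may go into TS, we must consult TJ too.
def can_insert_ts(ts, tj, new_t, new_s):
    # Subjects taught by the incoming teacher, according to TJ.
    new_subjects = {j for (t, j) in tj if t == new_t}
    # Subjects the student is already being taught, and by which teacher.
    current = {(j, t) for (t, s) in ts if s == new_s
                      for (t2, j) in tj if t2 == t}
    # Reject if the student would end up with two teachers for one subject.
    return not any(j in new_subjects and t != new_t for (j, t) in current)

# Smith is already taught Physics by Prof. Green, and Prof. Brown teaches
# Physics, so this attempted insertion is (correctly) rejected:
assert not can_insert_ts(TS, TJ, "Prof. Brown", "Smith")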
To sum up, what the foregoing example illustrates is as follows: There are two objectives we typically aim
for in nonloss decomposition, BCNF projections and FD preservation, and, sadly, these objectives can be in conflict
with one another (i.e., it isn't always possible to achieve both).
Now, at this point in an earlier draft of this chapter, I wrote the following:
So which objective do we give up on? Well, I'd tell you if I could, but I can't. What the SJT example demonstrates is
that the theory of normalization, important though it is, isn't enough as it stands; I mean, there are questions it doesn't
answer. So the message is: We need more science! Normalization theory is certainly scientific, but it doesn't solve all
design problems.
At the prompting of one of my reviewers, however, I've come to the conclusion that this paragraph is probably
overstated. It's not so much more science we need here, it's better implementations! That is, the main argument for
tolerating a less than properly normalized design, in cases like the one at hand, is the fact that today's DBMSs make
it quite awkward to deal with multirelvar constraints like the “lost” FD in the example. So let me set a stake in the