Preserving FDs - Database Design and Relational Theory

Databases Reference

In-Depth Information

Explanation: What this constraint says is, if we join SNC and ST, we get a result—call it S—in which the

number of distinct cities is equal to the number of distinct city / status pairs. And the fact that this latter

property holds is equivalent to saying the FD {CITY} → {STATUS} holds in the original relvar S.

So we've “lost” an FD. What are the implications? Well, certainly the multirelvar constraint that replaces it

is harder to state, as we've just seen. More to the point, perhaps, it's harder to enforce (harder, that is, than it would

have been with the preferred decomposition into projections SNC and CT, as illustrated in Fig. 3.2 in Chapter 3). 2

For example, suppose we update relvar SNC to change the city for supplier S1 from London to Athens; then we

must also update relvar ST to change the status for supplier S1 from 20 to 30—because if we don't, then joining

SNC and ST back together will produce a result that isn't a legitimate value for relvar S. (By contrast, if we update

relvar SNC to change the city for supplier S2 from Paris to Athens, then we don't have to update relvar ST as well—

but we still have to inspect relvar ST in order to determine that fact.)

Aside: It might be possible, given a well architected DBMS, to get the system to do that necessary inspection

of relvar ST “automatically,” instead of the user having to do it. It might even be possible to get the system

to perform any necessary additional updates “automatically,” too. Even given such a system, however, it's

still the case that the constraint is harder to enforce (i.e., more work still has to be done, even if it's done by

the system and not the user). In any case, such possibilities are just a pipedream at the time of writing—

today's commercial products typically don't allow multirelvar constraints even to be stated, in general; the

foregoing possibilities are out of reach today, and dealing with (and in particular enforcing) such constraints

is thus the user's responsibility. End of aside.

So the message is: Try to choose a decomposition that preserves FDs instead of losing them. (In the case at

hand, replacing the projection on {SNO,STATUS} by that on {CITY,STATUS} solves the problem.) Loosely

speaking, in other words, if the FD X → Y holds in the original relvar, try not to choose a decomposition in which X

winds up in one relvar and Y in another. Note: Of course, I'm assuming here that the decomposition isn't being

done on the basis of that FD X → Y itself—because if it is, we'll effectively wind up with two X 's, one of which will

be in the same relvar as Y (necessarily so) and the other won't. I'm also assuming, tacitly, that X → Y is part of

what's called an irreducible cover for the total set of FDs that hold in the original relvar. I'll be discussing

irreducible covers later in this chapter.

AN UNFORTUNATE CONFLICT

The basic idea of FD preservation is straightforward; unfortunately, however, there's quite a bit more that needs to

be said on the subject. First of all, I want to present what some people might regard as a pathological example.

We're given a relvar SJT with attributes S (student), J (subject), and T (teacher), and predicate Student S is taught

subject J by teacher T . The following business rules apply: 3

2 There might be performance penalties, too. Now, I shouldn't really mention this fact; as I indicated in Chapter 1, I never want performance

considerations to be the driving force behind my logical design. But in the case at hand, performance is just an additional point that happens to

reinforce my main argument.

3 A business rule is a statement, usually in natural language, that's supposed to capture some aspect of what the data in the database means or how

it's constrained. There's no consensus on any more precise definition of the term, though most writers would at least agree that relvar predicates

are an important special case.

Search WWH ::

Custom Search

Home