Databases Reference
In-Depth Information
Explanation: What this constraint says is, if we join SNC and ST, we get a result—call it S—in which the
number of distinct cities is equal to the number of distinct city / status pairs. And the fact that this latter
property holds is equivalent to saying the FD {CITY} → {STATUS} holds in the original relvar S.
So we've “lost” an FD. What are the implications? Well, certainly the multirelvar constraint that replaces it
is harder to state, as we've just seen. More to the point, perhaps, it's harder to enforce (harder, that is, than it would
have been with the preferred decomposition into projections SNC and CT, as illustrated in Fig. 3.2 in Chapter 3). 2
For example, suppose we update relvar SNC to change the city for supplier S1 from London to Athens; then we
must also update relvar ST to change the status for supplier S1 from 20 to 30—because if we don't, then joining
SNC and ST back together will produce a result that isn't a legitimate value for relvar S. (By contrast, if we update
relvar SNC to change the city for supplier S2 from Paris to Athens, then we don't have to update relvar ST as well—
but we still have to inspect relvar ST in order to determine that fact.)
Aside: It might be possible, given a well architected DBMS, to get the system to do that necessary inspection
of relvar ST “automatically,” instead of the user having to do it. It might even be possible to get the system
to perform any necessary additional updates “automatically,” too. Even given such a system, however, it's
still the case that the constraint is harder to enforce (i.e., more work still has to be done, even if it's done by
the system and not the user). In any case, such possibilities are just a pipedream at the time of writing—
today's commercial products typically don't allow multirelvar constraints even to be stated, in general; the
foregoing possibilities are out of reach today, and dealing with (and in particular enforcing) such constraints
is thus the user's responsibility. End of aside.
So the message is: Try to choose a decomposition that preserves FDs instead of losing them. (In the case at
hand, replacing the projection on {SNO,STATUS} by that on {CITY,STATUS} solves the problem.) Loosely
speaking, in other words, if the FD X Y holds in the original relvar, try not to choose a decomposition in which X
winds up in one relvar and Y in another. Note: Of course, I'm assuming here that the decomposition isn't being
done on the basis of that FD X Y itself—because if it is, we'll effectively wind up with two X 's, one of which will
be in the same relvar as Y (necessarily so) and the other won't. I'm also assuming, tacitly, that X Y is part of
what's called an irreducible cover for the total set of FDs that hold in the original relvar. I'll be discussing
irreducible covers later in this chapter.
The basic idea of FD preservation is straightforward; unfortunately, however, there's quite a bit more that needs to
be said on the subject. First of all, I want to present what some people might regard as a pathological example.
We're given a relvar SJT with attributes S (student), J (subject), and T (teacher), and predicate Student S is taught
subject J by teacher T . The following business rules apply: 3
2 There might be performance penalties, too. Now, I shouldn't really mention this fact; as I indicated in Chapter 1, I never want performance
considerations to be the driving force behind my logical design. But in the case at hand, performance is just an additional point that happens to
reinforce my main argument.
3 A business rule is a statement, usually in natural language, that's supposed to capture some aspect of what the data in the database means or how
it's constrained. There's no consensus on any more precise definition of the term, though most writers would at least agree that relvar predicates
are an important special case.
Search WWH ::

Custom Search