Databases Reference
In-Depth Information
talking about normalization in the classical sense here at all; a fortiori, therefore, we're not talking about classical
denormalization either. 7
I'll give one more example from the denormalization paper. This one starts with STORE and EMP relvars as
follows:
STORE { SNO , REGION , STATE , ... }
KEY { SNO , REGION , STATE }
EMP { ENO , SNO , REGION , STATE , ... }
KEY { ENO }
FOREIGN KEY { SNO , REGION , STATE } REFERENCES STORE
The predicates are Store SNO is located in region REGION within state STATE and Employee ENO is employed at
store SNO within region REGION within state STATE . The redundancies are obvious, and so the suggestion is to
introduce a surrogate identifier for stores, SID say, and thereby modify the design as follows:
STORE { SID , SNO , REGION , STATE , ... }
KEY { SID }
KEY { SNO , REGION , STATE }
EMP { ENO , SID , ... }
KEY { ENO }
FOREIGN KEY { SID } REFERENCES STORE
But this revised design not only involves no denormalization, it actually decreases redundancy! 8 —because the
association of a given SNO with a given REGION and STATE now appears just once, instead of once for every
employee of the store in question. (To spell the point out, it's obviously not denormalization because—among other
things—the one thing surely everybody agrees on is that denormalization is supposed to increase redundancy.)
By the way, I'm aware this last example might give the impression that I think surrogates are a good idea.
Sadly, however, they aren't always a good idea. The fact is, surrogates, while they might solve some problems, can
also introduce further problems of their own. See Exercise 8.3, also Chapter 15, for further discussion.
In closing this section, I'd like to make it very clear that the foregoing discussions are in no way intended as
an attack on the denormalization paper or its author. Indeed, the following quote from that paper should make it
clear that the author and I are really on the same side on the bigger issues:
[We should] stop criticizing the relational model and make a clear distinction between what's SQL and what's relational
... The two are totally different.
I couldn't agree more with this position, nor with the implication that the only reason we have to worry about
such matters as denormalizing at the logical level is because of failures on the part of today's SQL products. As I've
written elsewhere, in fact: 9 In an ideal system, we would never have to denormalize at all, at the logical level. Even
7 It's true that it might be possible to define a new kind of normalization, based on restriction and union instead of projection and join (I'll have
more to say about this possibility in Part IV of this topic). And if we did, well, I suppose we'd have a new kind of denormalization on our hands
also. But I'm pretty sure that such considerations aren't what the denormalization paper was referring to with its RESELLERS example.
8 Or does it? Again, see Chapter 15, which includes further discussion of the use of surrogates in particular.
9 E.g., in my topic An Introduction to Database Systems (8th edition, Addison-Wesley, 2004). For further discussion, see my topic Go Faster!
The TransRelational tm Approach to DBMS Implementation (Ventus Publishing, 2002, 2011).
Search WWH ::




Custom Search