Denormalization - Database Design and Relational Theory

Databases Reference

In-Depth Information

talking about normalization in the classical sense here at all; a fortiori, therefore, we're not talking about classical

denormalization either. 7

I'll give one more example from the denormalization paper. This one starts with STORE and EMP relvars as

follows:

STORE { SNO , REGION , STATE , ... }

KEY { SNO , REGION , STATE }

EMP { ENO , SNO , REGION , STATE , ... }

KEY { ENO }

FOREIGN KEY { SNO , REGION , STATE } REFERENCES STORE

The predicates are Store SNO is located in region REGION within state STATE and Employee ENO is employed at

store SNO within region REGION within state STATE . The redundancies are obvious, and so the suggestion is to

introduce a surrogate identifier for stores, SID say, and thereby modify the design as follows:

STORE { SID , SNO , REGION , STATE , ... }

KEY { SID }

KEY { SNO , REGION , STATE }

EMP { ENO , SID , ... }

KEY { ENO }

FOREIGN KEY { SID } REFERENCES STORE

But this revised design not only involves no denormalization, it actually decreases redundancy! 8 —because the

association of a given SNO with a given REGION and STATE now appears just once, instead of once for every

employee of the store in question. (To spell the point out, it's obviously not denormalization because—among other

things—the one thing surely everybody agrees on is that denormalization is supposed to increase redundancy.)

By the way, I'm aware this last example might give the impression that I think surrogates are a good idea.

Sadly, however, they aren't always a good idea. The fact is, surrogates, while they might solve some problems, can

also introduce further problems of their own. See Exercise 8.3, also Chapter 15, for further discussion.

In closing this section, I'd like to make it very clear that the foregoing discussions are in no way intended as

an attack on the denormalization paper or its author. Indeed, the following quote from that paper should make it

clear that the author and I are really on the same side on the bigger issues:

[We should] stop criticizing the relational model and make a clear distinction between what's SQL and what's relational

... The two are totally different.

I couldn't agree more with this position, nor with the implication that the only reason we have to worry about

such matters as denormalizing at the logical level is because of failures on the part of today's SQL products. As I've

written elsewhere, in fact: 9 In an ideal system, we would never have to denormalize at all, at the logical level. Even

7 It's true that it might be possible to define a new kind of normalization, based on restriction and union instead of projection and join (I'll have

more to say about this possibility in Part IV of this topic). And if we did, well, I suppose we'd have a new kind of denormalization on our hands

also. But I'm pretty sure that such considerations aren't what the denormalization paper was referring to with its RESELLERS example.

8 Or does it? Again, see Chapter 15, which includes further discussion of the use of surrogates in particular.

9 E.g., in my topic An Introduction to Database Systems (8th edition, Addison-Wesley, 2004). For further discussion, see my topic Go Faster!

The TransRelational tm Approach to DBMS Implementation (Ventus Publishing, 2002, 2011).

Search WWH ::

Custom Search

Home