Normalization: Some Generalities - Database Design and Relational Theory

Databases Reference

In-Depth Information

1.

It can be used to fix a logically incorrect design, as in the example discussed earlier in this section. Exercise:

Do issues analogous to those raised in that example apply to the STP example from the section “The Place of

Design Theory” in Chapter 1? ( Answer: Yes, they do.)

2.

It can be used to reduce redundancy in an otherwise logically correct design. (Obviously a design doesn't

have to be logically incorrect in the foregoing sense in order to display redundancy.)

Much confusion arises in practice because these two cases are often not clearly distinguished. Indeed, most

of the literature focuses on Case 2—and for definiteness I'll assume Case 2 myself in what follows, where it makes

any difference—but please don't lose sight of Case 1, which in practice is at least as important, if not more so.

I should point out further that, strictly speaking, the terminology of projections and joins applies only to Case

2. That's because in Case 1, as we've seen, the “new” relvars aren't necessarily projections of the “old” one, nor is

the “old” one necessarily the join of the “new” ones (if you see what I mean). In fact, what does it mean to talk

about projections and joins of relvars (as opposed to relations) anyway? Well, as I've written elsewhere: 9

By definition, the operators projection, join, and so on apply to relation values specifically. In particular, of course, they

apply to the values that happen to be the current values of relvars. It thus clearly makes sense to talk about, e.g., the

projection of relvar S on attributes {CITY,STATUS}, meaning the relation that results from taking the projection on

those attributes of the relation that's the current value of that relvar S. In some contexts, however (normalization, for

example), it turns out to be convenient to use expressions like “the projection of relvar S on attributes {CITY,STATUS}”

in a slightly different sense. To be specific, we might say, loosely but very conveniently, that some relvar , CT, is the

projection of relvar S on attributes {CITY,STATUS}—meaning, more precisely, that the value of relvar CT at all times

is the projection on those attributes of the value of relvar S at the time in question. In a sense, therefore, we can talk in

terms of projections of relvars per se, rather than just in terms of projections of current values of relvars. Analogous

remarks apply to all of the relational operations.

In other words, we do still use the projection/join terminology, even in Case 1. Such talk is somewhat

inappropriate—not to say sloppy—but it is at least succinct. But it would really be more accurate to say, not that

decomposition is a process of taking projections as such, but rather that it's a process that's reminiscent of , but not

quite the same as, what we do when we take projections (and similarly for recomposition and join).

UPDATE ANOMALIES

The concept of update anomalies is frequently mentioned in connection with normalization. Now, it should be clear

that redundancy of any kind can always lead to anomalies—because redundancy means, loosely, that some piece of

information is represented twice, and so there's always the possibility that the two representations don't agree (i.e.,

if one is updated and the other isn't). More specifically, let's consider the case of relvar S, where the following FD

holds:

9 E.g., in The Relational Database Dictionary, Extended Edition (Apress, 2008).

Search WWH ::

Custom Search

Home