Normalization: Some Generalities - Database Design and Relational Theory

Databases Reference

In-Depth Information

from relvar SNC. After that update, we could say, a trifle loosely, 7 that relvar S contains a tuple that has no

counterpart in relvar SNC (though it does have one in relvar CT). So again the two designs aren't really information

equivalent; and this time relvar S isn't exactly a “join” of relvars SNC and CT, since it contains a tuple that doesn't

correspond to any tuple in relvar SNC.

The two designs are thus not information equivalent after all. But didn't I say earlier that “losslessness” of

decompositions is an important property? Don't we generally assume that if Design B is produced by normalizing

Design A , then Design B and Design A are supposed to be information equivalent? What exactly is going on here?

In order to answer these questions, it's helpful to look at the relvar predicates. The predicate for SNC is:

Supplier SNO is named SNAME and is located in city CITY.

And the predicate for CT is:

City CITY has status STATUS.

Now suppose it's possible for a city to have a status even if no supplier is located in that city; in other words,

suppose it's possible for relvar CT to contain a tuple such as (Rome,10) that has no counterpart in relvar SNC. 8

Then the design consisting of just relvar S is simply incorrect. That is, if it's possible for a true instantiation to exist

of the predicate City CITY has status STATUS without there existing—at the same time and with the same CITY

value—a true instantiation of the predicate Supplier SNO is named SNAME and is located in city CITY , then a

design consisting just of relvar S doesn't faithfully reflect the state of affairs in the real world (because that design is

incapable of representing the status for a city in which no supplier is located).

Similarly, suppose it's possible for a supplier to be located in a city even if that city has no status; in other

words, suppose it's possible for relvar SNC to contain a tuple, say (S6,Lopez,Madrid), that has no counterpart in

relvar CT. Then, again, the design consisting just of relvar S is simply incorrect, because it requires every city in

which a supplier is located to have some status.

Here's another way to look at the foregoing argument. Suppose the design consisting just of relvar S did

faithfully reflect the state of affairs in the real world after all. Then relvars SNC and CT would be subject to the

following integrity constraint (“Every city in SNC appears in CT and vice versa”):

CONSTRAINT ... SNC { CITY } = CT { CITY } ;

But this constraint—which is an example of what I'm later going to be calling an equality dependency or EQD—

manifestly isn't satisfied in the example under discussion. Note: For simplicity, I haven't bothered to give this

constraint a name, as you can see. Indeed, I'll omit such names from all of my examples in this topic from this point

forward, except where there's some compelling reason to do otherwise.

To sum up, we see that normalization can be (and is) used to address two rather different problems:

7 In effect, by pretending relvars S, SNC, and CT all coexist (living alongside one another, as it were).

8 Here I'm adopting a sloppy convention by which the single quotes that ought really to enclose character string values are omitted in regular text,

thereby writing (Rome,10) instead of ('Rome',10). What's more, I'll adhere to this convention from this point forward.

Search WWH ::

Custom Search

Home