Databases Reference
In-Depth Information
implicitly). If DB additionally contains some distinct appearance of some tuple, or some combination of
tuples, that also represents p (either explicitly or implicitly), then DB contains, and D permits, redundancy .
The principles of normalization and The Principle of Orthogonal Design are aimed precisely at reducing
redundancy in the foregoing narrow sense. Observe, however, that all the definition says is if (not if and only if )
certain tuples appear, then there's redundancy; i.e., it's not a complete definition. Indeed, we'll see several
examples later in this chapter that still involve redundancy, even though they don't contain distinct tuples or tuple
combinations that represent the same proposition. What's more, most of the examples in question are fully
normalized and fully orthogonal. In other words, the principles of normalization and orthogonality, though
necessary and undoubtedly important, are far from being sufficient.
A LITTLE HISTORY
Before I get into a discussion of how and why normalization and orthogonality are insufficient, I'd like to say a little
more about Codd's attempts, in his very first two papers, to address the redundancy issue. In his 1969 paper, he said
this:
A set of relations is strongly redundant if it contains at least one relation [that] is derivable from the rest of the [relations
in the set].
And he tightened up this definition somewhat in the 1970 paper:
A set of relations is strongly redundant if it contains at least one relation that possesses a projection [that] is derivable
from other projections of relations in the set.
I should explain that when Codd says a relation r is derivable from a set S of relations, he means r is equal to
the result of applying some sequence of relational operations (join, projection, and so forth) to relations from S . I do
have a few comments on his definitions, however:
First, the term relation should be replaced by the term relvar throughout. (Of course, this latter term wasn't
introduced until many years later, and Codd never used it at all.)
Second, we can ignore the qualifier strongly . Codd was distinguishing between “strong” redundancy and
what he called weak redundancy, but weak redundancy is irrelevant as far as we're concerned. The reason is
that weak redundancy has to do merely with equality dependencies that don't hold at all times but do happen
to hold at particular times, given the relation values that happen to exist at the times in question. In fact, it
seems to me that Codd was struggling here with the logical difference between relations and relvars!─see the
previous bullet item. “Strong” redundancy applies to relvars (it's what we usually mean by redundancy
when we talk about database design). “Weak” redundancy, by contrast, applies to relations, not relvars (it's
just an artifact of the values the relvars happen to have at some particular time).
The 1969 definition is subsumed by the 1970 definition, of course, because (as we know from Chapter 6)
every relvar R is identically equal to a certain projection of R ─namely, the identity projection.
More to the point, the 1970 definition is still deficient as a definition of redundancy in general, for at least the
following two reasons:

Search WWH ::

Custom Search