Appendix B
Redundancy Revisited
Nothing is certain but the unforeseen
- 19th century proverb
In Chapter 13, I discussed a normal form I called RFNF (redundancy free normal form). However, I also
mentioned in that chapter that the very same name had been used earlier in a paper by Millist W. Vincent to
mean something rather different. 1 In this appendix, I present a brief introduction to Vincent's RFNF.
Consider our usual suppliers relvar S, with its FD {CITY} → {STATUS} and sample value as shown in
Fig. 1.1. The tuple for supplier S1 in that relvar has city London and status 20; as a consequence, the tuple for
supplier S4, which also has city London, must have status 20, for otherwise the FD {CITY} → {STATUS} would
be violated. In a sense, therefore, the occurrence of that status value 20 in the tuple for supplier S4 is redundant,
because there's nothing else it could possibly be: it's a logical consequence of, and is fully determined by, the
values appearing elsewhere in the relation that's the current value of the relvar at the time in question.
Examples like the foregoing provide the motivation for the following intuitively attractive definition (due to
Vincent but considerably paraphrased here):
Definition: Let relation r be a value of relvar R, let t be a tuple in r, and let v be an attribute value within t.
Then that occurrence of v within t is redundant in r, and R is subject to redundancy, if and only if
replacing that occurrence of v by an occurrence of v′ (v′ ≠ v), while leaving everything else unchanged,
causes some FD or JD of R to be violated.
In other words, redundancy exists if the attribute value occurrence in question must be v and nothing else.
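As a rough illustration of the definition, the following Python sketch tests whether one attribute value occurrence is redundant with respect to a set of FDs. It is only a sketch under stated assumptions: JDs are ignored; the function names (satisfies_fd, is_redundant) and the list of candidate replacement values are hypothetical; and the sample rows other than the London/20 values for S1 and S4 are assumed, since Fig. 1.1 isn't reproduced here.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def is_redundant(rows, index, attr, fds, candidates):
    """Return True if every replacement v' != v at rows[index][attr]
    violates at least one FD (JDs are not checked in this sketch)."""
    original = rows[index][attr]
    alternatives = [v for v in candidates if v != original]
    if not alternatives:
        return False  # nothing to try, so no conclusion is drawn
    for alt in alternatives:
        changed = [dict(r) for r in rows]  # copy, leaving all else unchanged
        changed[index][attr] = alt
        if all(satisfies_fd(changed, lhs, rhs) for lhs, rhs in fds):
            return False  # some other value is legal, so v isn't forced
    return True


# Assumed sample value for relvar S (only S1 and S4 are given in the text).
S = [
    {"SNO": "S1", "SNAME": "Smith", "STATUS": 20, "CITY": "London"},
    {"SNO": "S2", "SNAME": "Jones", "STATUS": 10, "CITY": "Paris"},
    {"SNO": "S3", "SNAME": "Blake", "STATUS": 10, "CITY": "Paris"},
    {"SNO": "S4", "SNAME": "Clark", "STATUS": 20, "CITY": "London"},
    {"SNO": "S5", "SNAME": "Adams", "STATUS": 30, "CITY": "Athens"},
]

# FDs assumed to hold: {CITY} -> {STATUS}, plus the key FD on {SNO}.
FDS = [
    (("CITY",), ("STATUS",)),
    (("SNO",), ("SNAME", "STATUS", "CITY")),
]

# Hypothetical candidate status values to try as replacements.
STATUS_VALUES = [10, 20, 30, 40]

s4_status = is_redundant(S, 3, "STATUS", FDS, STATUS_VALUES)
s5_status = is_redundant(S, 4, "STATUS", FDS, STATUS_VALUES)
```

Under these assumptions, the status value in the tuple for S4 is reported as redundant, because any other value conflicts with the tuple for S1. The status value for S5 is not, because no other tuple shares its city.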
Aside: Even though I said the foregoing definition is intuitively attractive (and I think it is), I think I should
also point out that in at least one respect it's a little strange, too. Consider the motivating example again, in
which the tuple in relvar S for supplier S4 had to have status value 20 because the tuple for supplier S1 had
status value 20. Observe now that the reverse argument holds equally well: The tuple for supplier S1 has to
have status value 20 because the tuple for supplier S4 has status value 20! Now, it surely makes no sense to
say those 20's are both redundant (does it?), but the fact that it appears to be arbitrary as to which of the
two we regard as the redundant one does seem a little odd, at least to me. End of aside.
Be that as it may, Vincent goes on to define a new normal form, as follows: 2
Definition: Relvar R is in (Vincent's) redundancy free normal form, RFNF, if and only if it's not subject
to redundancy as just defined.
1 Millist W. Vincent: “Redundancy Elimination and a New Normal Form for Relational Database Design,” in B. Thalheim and L. Libkin (eds.),
Semantics in Databases. Berlin, FDR: Springer-Verlag (1998).
2 Note that the arbitrariness just referred to has no impact on this definition (perhaps fortunately).