Databases Reference

In-Depth Information

kind of redundancy we've been talking about throughout this topic so far (i.e., the kind that can be eliminated by

taking projections). Now, I hope you find this state of affairs as surprising as I did when I first encountered it, in

2010! 5NF was defined by Fagin in 1979, and for the next 30 years or so it was widely believed that a relvar had to

be in 5NF in order to be redundancy free (meaning 5NF was certainly necessary, though not in general sufficient, in

order to eliminate the kind of redundancy we've been talking about). It turns out, however, that 5NF isn't absolutely

necessary to achieve that goal after all; more specifically, it turns out that a new normal form, one that's strictly

weaker than 5NF and yet stronger than fourth normal form (4NF), is precisely as effective as 5NF at eliminating

redundancy. In fact, that new normal form─which we call, for obvious reasons,
redundancy free normal form

(RFNF)─turns out to be both necessary and sufficient for the purpose. If the goal of normalization is to reduce

redundancy, therefore, RFNF, not 5NF, is the target to be aimed for.

Aside:
The name
RFNF
as used here is taken from a preliminary draft of a recent paper by Hugh Darwen,

Ron Fagin, and myself (see Appendix G). As this topic was going to press, however, we discovered that an

earlier paper (“Redundancy Elimination and a New Normal Form for Relational Database Design,” by Millist

W. Vincent, 1998) had already used that name to refer to something else (more specifically, something

different from “our” normal form). For obvious reasons, therefore, we intend to choose a new name for our

RFNF;
8
for present purposes, however, I'll continue to use the name
RFNF
to refer to our normal form,

barring explicit statements to the contrary. Apologies if this causes any confusion. I'll have more to say

about Vincent's RFNF, as well as ours, in Appendix B.
End of aside.

To illustrate these ideas, I'll begin with an example: a modified form─I'll call it SPJ′─of relvar SPJ from

Chapter 9. As before, the relvar has attributes SNO (supplier number), PNO (part number), and JNO (project

number), and the predicate too is the same as before:
Supplier SNO supplies part PNO to project JNO
. Also, the

following business rule is in effect (again as before):

1.

If supplier
s
supplies part
p
and part
p
is supplied to project
j
and project
j
is supplied by supplier
s
, then

supplier
s
supplies part
p
to project
j
.

However, suppose now that the following business rule is in effect as well:

2.

Any given supplier
s
supplies a given part
p
to at most one project
j
.

As we saw earlier, then, the following JD captures the essence of the first of these rules and therefore holds

in SPJ′:

{ { SNO , PNO } , { PNO , JNO } , { JNO , SNO } }

Likewise, the following FD captures the essence of the second rule and therefore also holds in SPJ′:

{ SNO , PNO }
→
{ JNO }

From this FD, it follows that {SNO,PNO} is a key for SPJ′. Furthermore, it can be shown that no other FDs or JDs

hold, apart from trivial ones. Thus, since the foregoing JD certainly isn't implied by the sole key─the membership

algorithm fails─SPJ′ isn't in 5NF, though it
is
in BCNF (just as SPJ wasn't in 5NF but was in BCNF, and for

8
“Our” RFNF has since been renamed
ETNF
(“essential tuple normal form”). For further discussion, see the “Stop Press” in Appendix C.