Databases Reference

In-Depth Information

By way of an example of a relvar that's in 3NF but not BCNF, consider a revised version of the shipments

relvar—let's call it SNP─that has an additional attribute SNAME, representing the name of the applicable supplier.

Suppose also that supplier names are necessarily unique (i.e., no two suppliers ever have the same name at the same

time). Here then are some sample tuples:

┌─────┬───────┬─────┬─────┐

SNP │ SNO │ SNAME │ PNO │ QTY │

├─────┼───────┼─────┼─────┤

│ S1 │ Smith │ P1 │ 300 │

│ S1 │ Smith │ P2 │ 200 │

│ S1 │ Smith │ P3 │ 400 │

│ .. │ ..... │ .. │ ... │

│ S2 │ Jones │ P1 │ 300 │

│ S2 │ Jones │ P2 │ 400 │

│ .. │ ..... │ .. │ ... │

Once again we observe some redundancy: Every tuple for supplier S1 tells us S1 is named Smith, every

tuple for supplier S2 tells us S2 is named Jones, and so on; likewise, every tuple for Smith tells us Smith's supplier

number is S1, every tuple for Jones tells us Jones's supplier number is S2, and so on. And the relvar isn't in BCNF.

First of all, it has two keys, {SNO,PNO} and {SNAME,PNO}.
8
Second, every subset of the
heading—{QTY} in

particular—is (of course) functionally dependent on both of those keys. Third, however, the FDs {SNO} →

{SNAME} and {SNAME} → {SNO} also hold; these FDs are certainly not trivial, nor are they arrows out of

superkeys, and so the relvar isn't in BCNF (though it is in 3NF).

Finally, as I'm sure you know, the normalization discipline says: If relvar
R
isn't in BCNF, then decompose

it into projections that are. In the case of relvar SNP, either of the following decompositions will meet this

objective:

Projecting on {SNO,SNAME} and {SNO,PNO,QTY}

Projecting on {SNO,SNAME} and {SNAME,PNO,QTY}

By the way, I can now explain why BCNF is the odd man out, as it were, in not having a name of the form

“
n
th normal form.” I quote from the paper in which Codd first described this new normal form:
9

More recently, Boyce and Codd developed the following definition: A [relvar] R is in third normal form if it is in first

normal form and, for every attribute collection C of R, if any attribute not in C is functionally dependent on C, then all

attributes in R are functionally dependent on C [
i.e., C is a superkey
].

So Codd was giving here what he regarded as a “new and improved” definition of
third
normal form. The

trouble was, the new definition was, and is, strictly stronger than the old one; that is, any relvar that's in 3NF by the

new definition is certainly in 3NF by the old one, but the converse isn't true—a relvar can be in 3NF by the old

definition and not in 3NF by the new one (relvar SNP, discussed above, is a case in point). So what that “new and

improved” definition defined was really a new and stronger normal form, which therefore needed a distinctive name

8
That's why I didn't show any double underlining when I showed the sample tuples─there are two candidate keys, and there doesn't seem to be

any good reason to make either of them “more equal than the other.”

9
E. F. Codd: “Recent Investigations into Relational Data Base Systems,” Proc. IFIP Congress, Stockholm, Sweden (1974).