Databases Reference

In-Depth Information

Conventional wisdom has it that FD preservation is relevant only to the step from 3NF to BCNF, but as

we've seen such isn't necessarily the case.

The FDs typically suggested by the conventional procedure as the basis for decomposition aren't necessarily

the best ones to use.

That procedure also assumes the best design can be found by stepping from 1NF to 2NF to 3NF (etc.) in

sequence.

Of course, the very nomenclature of “first,” “second,” etc. reinforces this last perception ... but that

nomenclature is really nothing more than a historical accident, in a way. I mean, if the first of the normal forms to

be defined had been BCNF—which it easily could have been, since the definition is so conceptually simple,

involving as it does no mention of FD irreducibility, nonkey attributes, subkeys, 1NF, 2NF, or 3NF—then there

would really never have been any need to call out 2NF and 3NF as specific normal forms, as such, at all.
8

A PROCEDURE THAT WORKS

Here now is a procedure that's guaranteed to produce a decomposition in which all relvars are in 3NF (though not

necessarily BCNF) and all FDs are preserved.
9
For convenience, I'll refer to it in what follows as
the 3NF

procedure
. The input is a relvar
R
and what's called an
irreducible cover
,
C
say, for the FDs that hold in
R
. I'll

explain what an irreducible cover is in a few moments—by the way, there's that word
irreducible
again—but let me

state the procedure first:

1.

Let
S
be a set of headings. Initialize
S
to the empty set, {}.

2.

Let
X
be the left side (the determinant) of some FD in
C
; let the complete set of FDs in
C
with left side
X
be

X
→
Y1
, ...,
X
→
Yn
; and let the union of
Y1
, ...
Yn
be
Y
. Add the union of
X
and
Y
to
S
. Perform this step for

each distinct
X
.

3.

Let
U
be the set of attributes of
R
not contained in any element of
S
. If
U
is nonempty, add
U
to
S
.

4.

If no element of
S
is a superkey for
R
, add some key
K
of
R
to
S
.

At the conclusion of this procedure, the elements of
S
are the headings of a set of 3NF relvars into which
R
can be

nonloss decomposed without losing any FDs. Note in particular that the procedure makes no explicit mention of

2NF, not even as some kind of stepping stone.

So how does it work? Clearly, the notion of an irreducible cover is important. In order to explain that

notion, let me first call out something I've appealed to several times already in passing: namely, the fact that
some

FDs imply others
. As a simple example, the FDs
X
→
Y
and
Y
→
Z
together imply the FD
X
→
Z
─by which I

mean, if the first two FDs are satisfied by relation
r
, then the third one must be satisfied as well. Or, perhaps more

to the point: If the first two hold in relvar
R
, then the third one must hold as well. We saw an illustration in the

8
In support of this contention, I'd like to quote something Codd himself had to say in the paper in which he introduced 2NF and 3NF (see

Appendix C): “The basic ideas underlying [these] normal forms are simple, but they have many subtle ramifications. The author has found that

numerous examples are needed to explain and motivate the precise definitions of these normal forms.”

9
I give this procedure partly for historical reasons. You can skip it if you like.