Aside:
It might be argued, at least in the SNC and CT example, that the decomposition also means there's a
constraint that now doesn't have to be maintained: viz., the FD {CITY} → {STATUS}. But this argument
isn't entirely valid—all the decomposition does, in this respect, is move that constraint from one relvar to
another (actually from relvar S to relvar CT, where it's maintained as a side effect of maintaining the
constraint that {CITY} is a key).
End of aside.
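The aside's point can be sketched in code. The following is a minimal illustration (not from the book): relations are modeled as lists of dicts, and the attribute names CITY and STATUS follow the usual suppliers example. If {CITY} is a key of CT, then no two CT tuples can agree on CITY, so the FD {CITY} → {STATUS} can never be violated in CT.

```python
# Sketch (assumed representation): a relation as a list of dicts,
# with attribute names CITY and STATUS from the usual suppliers example.

def fd_holds(relation, lhs, rhs):
    """Check whether the FD lhs -> rhs holds in the given relation:
    tuples that agree on lhs must also agree on rhs."""
    seen = {}
    for t in relation:
        determinant = tuple(t[a] for a in lhs)
        dependent = tuple(t[a] for a in rhs)
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True

def is_key(relation, attrs):
    """Check that no two tuples agree on attrs (i.e., attrs is a superkey)."""
    projections = [tuple(t[a] for a in attrs) for t in relation]
    return len(projections) == len(set(projections))

ct = [
    {"CITY": "London", "STATUS": 20},
    {"CITY": "Paris",  "STATUS": 30},
]

# {CITY} is a key of CT, so the FD {CITY} -> {STATUS} holds as a side effect:
assert is_key(ct, ["CITY"])
assert fd_holds(ct, ["CITY"], ["STATUS"])
```

The key constraint subsumes the FD here: any relation in which {CITY} is unique trivially satisfies every FD whose determinant is {CITY}.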
Now, in the simple example under discussion, the benefits of doing the decomposition almost certainly
outweigh the benefits of not doing so. But such is not always the case; indeed, the question of whether or not to
decompose, in more complicated situations, can be a fairly vexing one. In what follows, in order to avoid a lot of
repetitive text, I'll tend to assume we do always want to do the decomposition—but please don't forget there can
sometimes be persuasive arguments for not doing so, especially in examples more complex than the one at hand,
such as are discussed in Part III of this book.
CONCLUDING REMARKS
I'd like to close this chapter by addressing a question I haven't discussed in this book at all so far. It's a matter of
terminology. To be specific: Why are 1NF, 2NF, and the rest called normal forms, anyway? Come to that, why is
normalization called normalization?
The answers to these questions derive from mathematics (though the ideas spill over into several related
disciplines, including the computing discipline in particular—think of floating point numbers, for example). In
mathematics, we often find ourselves having to deal with some large, possibly even infinite, set of objects of some
kind: for example, the set of all matrices, or the set of all rational numbers, or—coming a little closer to home—the
set of all relations. In such a situation, it's desirable to find a set of canonical forms for the objects in question.
Here's a definition:
Definition: Given a set s1, together with a defined notion of equivalence among elements of that set, subset s2 of s1 is a set of canonical forms for s1 if and only if every element x1 of s1 is equivalent to just one element x2 of s2 under that notion of equivalence (and that element x2 is the canonical form for the element x1).12
Various “interesting” properties that apply to x1 also apply to x2; thus, we can study just the small set s2, not the large set s1, in order to prove a variety of “interesting” theorems or results.
As a trivial illustration of this notion, let s1 be the set of nonnegative integers {0,1,2,...}, and let two such integers be equivalent if and only if they leave the same remainder on division by five. Then we can define s2 to be the set {0,1,2,3,4}. As for an “interesting” theorem that applies in this example, let x1, y1, and z1 be any three elements of s1 (i.e., any three nonnegative integers), and let their canonical forms in s2 be x2, y2, and z2, respectively; then the product y1 * z1 is equivalent to x1 if and only if the product y2 * z2 is equivalent to x2.
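The mod-5 illustration above is easy to make concrete. The following sketch (my own, not from the book) maps each nonnegative integer to its canonical form via the remainder on division by five, verifies that every element of s1 is equivalent to exactly one element of s2, and spot-checks the products theorem on a few triples.

```python
# Canonical forms for the mod-5 example: s1 is the nonnegative integers,
# s2 = {0, 1, 2, 3, 4}, and two integers are equivalent iff they leave
# the same remainder on division by five.

def canonical(x):
    """Map an element x1 of s1 to its canonical form x2 in s2."""
    return x % 5

def equivalent(a, b):
    return canonical(a) == canonical(b)

s2 = {0, 1, 2, 3, 4}

# Every element of s1 (checked here up to 99) is equivalent to
# exactly one element of s2:
for x1 in range(100):
    assert sum(1 for x2 in s2 if equivalent(x1, x2)) == 1

# The "interesting" theorem: y1 * z1 is equivalent to x1 if and only if
# y2 * z2 is equivalent to x2, where x2, y2, z2 are the canonical forms.
for x1, y1, z1 in [(6, 2, 3), (7, 4, 8), (11, 3, 9)]:
    x2, y2, z2 = canonical(x1), canonical(y1), canonical(z1)
    assert equivalent(y1 * z1, x1) == equivalent(y2 * z2, x2)
```

This is why canonical forms pay off: the theorem can be checked against the five elements of s2 rather than against the infinitely many elements of s1.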
12 It's reasonable to require also that every element x2 of s2 be equivalent to at least one element x1 of s1. Let me also draw your attention to the following remarks, paraphrased from the answer to Exercise 2.3 in Appendix D: Throughout this book, expressions of the form “B is a subset of A” must be understood to include the possibility that B and A might be equal. For example, the set {x, y, z} is a subset of itself. When I want to exclude such a possibility, I'll talk explicitly in terms of proper subsets; for example, the set {x, z} is a proper subset of the set {x, y, z}. Of course, no set is a proper subset of itself.
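The footnote's subset vs. proper-subset distinction maps directly onto Python's built-in set operators, which may help as a quick check (this example is mine, not the book's):

```python
# Subset (<=) includes equality; proper subset (<) excludes it.
a = {"x", "y", "z"}
b = {"x", "z"}

assert a <= a        # every set is a subset of itself
assert b <= a        # b is a subset of a
assert b < a         # b is also a proper subset of a
assert not (a < a)   # but no set is a proper subset of itself
```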