Aside: It might be argued, at least in the SNC and CT example, that the decomposition also means there's a
constraint that now doesn't have to be maintained: viz., the FD {CITY} → {STATUS}. But this argument
isn't entirely valid—all the decomposition does, in this respect, is move that constraint from one relvar to
another (actually from relvar S to relvar CT, where it's maintained as a side effect of maintaining the
constraint that {CITY} is a key). End of aside.
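The point of the aside can be sketched in code. The following is a minimal illustration (the representation of a relvar as a list of dicts, the helper name `fd_holds`, and the sample STATUS values are all my own assumptions, not from the text): a generic test of whether an FD X → Y holds in a relation, applied to relvar CT, where the fact that {CITY} is a key guarantees the FD trivially.

```python
# Illustrative sketch (names and data representation are assumptions):
# a relation as a list of dicts, and a generic check of whether the
# functional dependency X -> Y holds, i.e. any two tuples that agree
# on the attributes in X also agree on the attributes in Y.

def fd_holds(relation, X, Y):
    """Return True iff the FD X -> Y holds in the given relation."""
    seen = {}
    for t in relation:
        x_val = tuple(t[a] for a in X)
        y_val = tuple(t[a] for a in Y)
        # setdefault records the first Y-value seen for this X-value;
        # a later, different Y-value for the same X-value violates the FD.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True

# In relvar CT, {CITY} is a key, so no two tuples share a CITY value;
# {CITY} -> {STATUS} therefore holds as a side effect of the key constraint.
# (Sample values are hypothetical.)
CT = [{"CITY": "London", "STATUS": 20},
      {"CITY": "Paris",  "STATUS": 10},
      {"CITY": "Athens", "STATUS": 30}]
assert fd_holds(CT, ["CITY"], ["STATUS"])
```

Note that the check would fail the moment two CT tuples shared a city with different statuses, which is exactly what enforcing {CITY} as a key rules out.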
Now, in the simple example under discussion, the benefits of doing the decomposition almost certainly
outweigh the benefits of not doing so. But such is not always the case; indeed, the question of whether or not to
decompose, in more complicated situations, can be a fairly vexing one. In what follows, in order to avoid a lot of
repetitive text, I'll tend to assume we do always want to do the decomposition—but please don't forget there can
sometimes be persuasive arguments for not doing so, especially in examples more complex than the one at hand,
such as are discussed in Part III of this book.
I'd like to close this chapter by addressing a question I haven't discussed in this book at all so far. It's a matter of
terminology. To be specific: Why are 1NF, 2NF, and the rest called normal forms, anyway? Come to that, why is
normalization called normalization?
The answers to these questions derive from mathematics (though the ideas spill over into several related
disciplines, including the computing discipline in particular—think of floating point numbers, for example). In
mathematics, we often find ourselves having to deal with some large, possibly even infinite, set of objects of some
kind: for example, the set of all matrices, or the set of all rational numbers, or—coming a little closer to home—the
set of all relations. In such a situation, it's desirable to find a set of canonical forms for the objects in question.
Here's a definition:
Definition: Given a set s1, together with a defined notion of equivalence among elements of that set, subset
s2 of s1 is a set of canonical forms for s1 if and only if every element x1 of s1 is equivalent to just one
element x2 of s2 under that notion of equivalence (and that element x2 is the canonical form for the element
x1).12 Various “interesting” properties that apply to x1 also apply to x2; thus, we can study just the small set
s2, not the large set s1, in order to prove a variety of “interesting” theorems or results.
As a trivial illustration of this notion, let s1 be the set of nonnegative integers {0, 1, 2, ...}, and let two such
integers be equivalent if and only if they leave the same remainder on division by five. Then we can define s2 to be
the set {0, 1, 2, 3, 4}. As for an “interesting” theorem that applies in this example, let x1, y1, and z1 be any three
elements of s1 (i.e., any three nonnegative integers), and let their canonical forms in s2 be x2, y2, and z2,
respectively; then the product y1 * z1 is equivalent to x1 if and only if the product y2 * z2 is equivalent to x2.
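The remainder-on-division-by-five example can be checked mechanically. Here is a small sketch (the function names `canonical` and `equivalent` are mine, not from the text) that maps each nonnegative integer to its canonical form in {0, 1, 2, 3, 4} and exhaustively verifies the stated theorem over a modest range:

```python
# Canonical forms for nonnegative integers under the equivalence
# "same remainder on division by five"; s2 = {0, 1, 2, 3, 4}.

def canonical(n):
    """The canonical form of n: its remainder on division by five."""
    return n % 5

def equivalent(a, b):
    """a and b are equivalent iff they leave the same remainder mod 5."""
    return a % 5 == b % 5

# The "interesting" theorem: for any x1, y1, z1 with canonical forms
# x2, y2, z2, the product y1 * z1 is equivalent to x1 if and only if
# y2 * z2 is equivalent to x2. Check it for all triples below 30.
for x1 in range(30):
    for y1 in range(30):
        for z1 in range(30):
            x2, y2, z2 = canonical(x1), canonical(y1), canonical(z1)
            assert equivalent(y1 * z1, x1) == equivalent(y2 * z2, x2)
```

The practical payoff is just what the definition promises: to reason about products modulo five across all nonnegative integers, it suffices to examine the 125 triples drawn from the five canonical forms.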
12 It's reasonable to require also that every element x2 of s2 be equivalent to at least one element x1 of s1. Let me also draw your attention to the
following remarks, paraphrased from the answer to Exercise 2.3 in Appendix D: Throughout this book, expressions of the form “B is a subset of
A” must be understood to include the possibility that B and A might be equal. For example, the set {x, y, z} is a subset of itself. When I want to
exclude such a possibility, I'll talk explicitly in terms of proper subsets; for example, the set {x, z} is a proper subset of the set {x, y, z}. Of course,
no set is a proper subset of itself.