Database Reference
In-Depth Information
table meeting the conditions in Figure 3-4 is therefore a relation in 1NF. Codd also noted that some
tables (or, interchangeably in this topic, relations) in 1NF had modification anomalies. He found
that he could remove some of those anomalies by applying certain conditions. A relation that met
those conditions, which we will discuss later in this chapter, was said to be in second normal
form (2NF) . He also observed, however, that relations in 2NF could also have anomalies, and so he
defined third normal form (3NF) , which is a set of conditions that removes even more anomalies
and which we will also discuss later in this chapter. As time went by, other researchers found still
other ways that anomalies can occur, and the conditions for Boyce-Codd Normal Form (BCNF)
were defined.
These normal forms are defined so that a relation in BCNF is in 3NF, a relation in 3NF is in
2NF, and a relation in 2NF is in 1NF. Thus, if you put a relation into BCNF, it is automatically in
the lesser normal forms.
Normal forms 2NF through BCNF concern anomalies that arise from functional depen-
dencies. Other sources of anomalies were found later. They led to the definition of fourth
normal form (4NF) and fifth normal form (5NF) , both of which we will discuss later in this
chapter. So it went, with researchers chipping away at modification anomalies, each one im-
proving on the prior normal form.
In 1982, Fagin published a paper that took a different tack. 5 Instead of looking for just
another normal form, Fagin asked, “What conditions need to exist for a relation to have no
anomalies?” In that paper, he defined domain/key normal form (DK/NF) . Fagin ended the
search for normal forms by showing that a relation in DK/NF has no modification anomalies
and, further, that a relation that has no modification anomalies is in DK/NF. DK/NF is dis-
cussed in more detail later in this chapter.
Normalization Categories
As shown in Figure 3-12, normalization theory can be divided into three major categories.
Some anomalies arise from functional dependencies, some arise from multivalued dependen-
cies, and some arise from data constraints and odd conditions.
BCNF, 3NF, and 2NF are all concerned with anomalies that are caused by functional de-
pendencies. A relation that is in BCNF has no modification anomalies from functional depen-
dencies. It is also automatically in 2NF and 3NF, and, therefore, we will focus on transforming
relations into BCNF. However, it is instructive to work through the progression of normal
forms from 1NF to BCNF in order to understand how each normal form deals with specific
anomalies, and we will do this later in this chapter. 6
As shown in the second row of Figure 3-12, some anomalies arise because of another kind
of dependency called a multivalued dependency. Those anomalies can be eliminated by plac-
ing each multivalued dependency in a relation of its own, a condition known as 4NF. You will
see how to do that in the last section of this chapter.
Figure 3-12
Summary of Normalization
theory
Source of Anomaly
Normal Forms
Design Principles
Functional dependencies
1NF, 2NF,
3NF, BCNF
BCNF: Design tables so that every
determinant is a candidate key.
Multivalued dependencies
4NF
4NF: Move each multivalued
dependency to a table of its own.
Data constraints and oddities
5NF, DK/NF
DK/NF: Make every constraint a
logical consequence of candidate
keys and domains.
5 R. Fagin, “A Normal Form for Relational Databases That Is Based on Domains and Keys,” ACM Transactions on
Database Systems , September 1981, pp. 387-414.
6 See C. J. Date, An Introduciton to Database Systems, 8th ed. (New York: Addison-Wesley, 2003) for a complete
discussion of normal forms.
 
 
Search WWH ::




Custom Search