The Relational Model and Normalization - Database Processing: Fundamentals, Design, and Implementation

Database Reference

In-Depth Information

table meeting the conditions in Figure 3-4 is therefore a relation in 1NF. Codd also noted that some

tables (or, interchangeably in this topic, relations) in 1NF had modification anomalies. He found

that he could remove some of those anomalies by applying certain conditions. A relation that met

those conditions, which we will discuss later in this chapter, was said to be in second normal

form (2NF) . He also observed, however, that relations in 2NF could also have anomalies, and so he

defined third normal form (3NF) , which is a set of conditions that removes even more anomalies

and which we will also discuss later in this chapter. As time went by, other researchers found still

other ways that anomalies can occur, and the conditions for Boyce-Codd Normal Form (BCNF)

were defined.

These normal forms are defined so that a relation in BCNF is in 3NF, a relation in 3NF is in

2NF, and a relation in 2NF is in 1NF. Thus, if you put a relation into BCNF, it is automatically in

the lesser normal forms.

Normal forms 2NF through BCNF concern anomalies that arise from functional depen-

dencies. Other sources of anomalies were found later. They led to the definition of fourth

normal form (4NF) and fifth normal form (5NF) , both of which we will discuss later in this

chapter. So it went, with researchers chipping away at modification anomalies, each one im-

proving on the prior normal form.

In 1982, Fagin published a paper that took a different tack. 5 Instead of looking for just

another normal form, Fagin asked, “What conditions need to exist for a relation to have no

anomalies?” In that paper, he defined domain/key normal form (DK/NF) . Fagin ended the

search for normal forms by showing that a relation in DK/NF has no modification anomalies

and, further, that a relation that has no modification anomalies is in DK/NF. DK/NF is dis-

cussed in more detail later in this chapter.

Normalization Categories

As shown in Figure 3-12, normalization theory can be divided into three major categories.

Some anomalies arise from functional dependencies, some arise from multivalued dependen-

cies, and some arise from data constraints and odd conditions.

BCNF, 3NF, and 2NF are all concerned with anomalies that are caused by functional de-

pendencies. A relation that is in BCNF has no modification anomalies from functional depen-

dencies. It is also automatically in 2NF and 3NF, and, therefore, we will focus on transforming

relations into BCNF. However, it is instructive to work through the progression of normal

forms from 1NF to BCNF in order to understand how each normal form deals with specific

anomalies, and we will do this later in this chapter. 6

As shown in the second row of Figure 3-12, some anomalies arise because of another kind

of dependency called a multivalued dependency. Those anomalies can be eliminated by plac-

ing each multivalued dependency in a relation of its own, a condition known as 4NF. You will

see how to do that in the last section of this chapter.

Figure 3-12

Summary of Normalization

theory

Source of Anomaly

Normal Forms

Design Principles

Functional dependencies

1NF, 2NF,

3NF, BCNF

BCNF: Design tables so that every

determinant is a candidate key.

Multivalued dependencies

4NF

4NF: Move each multivalued

dependency to a table of its own.

Data constraints and oddities

5NF, DK/NF

DK/NF: Make every constraint a

logical consequence of candidate

keys and domains.

5 R. Fagin, “A Normal Form for Relational Databases That Is Based on Domains and Keys,” ACM Transactions on

Database Systems , September 1981, pp. 387-414.

6 See C. J. Date, An Introduciton to Database Systems, 8th ed. (New York: Addison-Wesley, 2003) for a complete

discussion of normal forms.

Database Processing: Fundamentals, Design, and Implementation

Search WWH ::

Custom Search

Home