4.3 Rationale for Normalization
Normalization is the process of defining the database (conceptual schema) in a manner that ensures efficiency and ease of data access. Normalization ensures the following:
• Data integrity
• Control of redundancy
• Logical data independence
• Avoidance of modification anomalies
A system with un-normalized files can experience the following problems:
• Data redundancy, which leads to modification anomalies
• Modification anomalies, which include:
  – Insertion anomaly: data cannot be inserted when it is desirable to do so; because of how the data structure is organized, one must wait for some other data to exist first
  – Deletion anomaly: deleting some undesirable aspect(s) of the data necessarily deletes other desirable aspect(s) of the data
  – Update anomaly: updating some aspect(s) of the data necessarily requires updating other aspect(s) of the data
• Inefficient file manipulation and lack of ease of data access
• Inhibition of logical data independence
• Compromised data integrity
• Increased programming effort to compensate for the poor design
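The three modification anomalies above can be illustrated with a small sketch. The table below is hypothetical (the column names and values are illustrative, not from the text): an un-normalized enrollment file that repeats each lecturer's office in every enrollment row, so facts about lecturers depend on enrollment rows existing.

```python
# Hypothetical un-normalized "enrollment" file: each row repeats the
# lecturer's office, so a lecturer fact exists only inside enrollment rows.
enrollments = [
    {"student": "Amy",  "course": "DB101", "lecturer": "Jones", "office": "A-12"},
    {"student": "Ben",  "course": "DB101", "lecturer": "Jones", "office": "A-12"},
    {"student": "Cara", "course": "MA200", "lecturer": "Singh", "office": "B-07"},
]

# Update anomaly: moving Jones to a new office forces a change to every
# row that repeats the fact; missing one row leaves the data inconsistent.
for row in enrollments:
    if row["lecturer"] == "Jones":
        row["office"] = "C-03"

# Deletion anomaly: removing Cara (the only MA200 enrollment) also
# destroys the only record of Singh's office.
enrollments = [r for r in enrollments if r["student"] != "Cara"]
assert "Singh" not in {r["lecturer"] for r in enrollments}

# Insertion anomaly: a new lecturer with no enrolled students cannot be
# recorded at all without inventing a dummy student row.
```

Normalizing the design into separate Lecturer and Enrollment relations stores each fact once, so each anomaly disappears.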
Figure 4-1 indicates the six most commonly used normal forms. The hierarchy is such that a relation in a given normal form is automatically in all normal forms prior to it.
Thus a relation in the second normal form (2NF) is automatically in the first normal form (1NF); a relation in the third normal form (3NF) is in 2NF, and so on. Edgar Frank Codd defined the first three normal forms in the early 1970s; the Boyce-Codd normal form (BCNF) was subsequently deduced from his work. The fourth and fifth normal forms (4NF and 5NF) were later defined by Ronald Fagin in the late 1970s.