Databases Reference
In-Depth Information
FunctionalDependenciesandFormsofNormalization
Before describing the method for normalizing a table some background information on normal forms
and functional dependencies is needed.
A functional dependency means that knowing the value of one attribute is sufficient to find the value of
another attribute. In relational algebra notation this is written as: X
Y . This translates as X determines
Y . This can be extended to something called mutivalue dependency. This is written as: X
→→
Y .This
translates as X determines several Ys .
Normalizing is the act of classifying a table based on its functional dependencies. The first three normal
forms can be informally understood knowing that all non-key columns are wholly dependent on the key.
Fourth normal form deals specifically with many-to-many and one-to-many relationships.
First Normal Form (1NF): This means that a table does not have any repeating groups. Another
way to state this is that every column contains singular (or atomic) values. Lists of values are not
allowed. The sample fails to satisfy this condition because the Frequency column contains a list
of values for each row.
Second Normal Form (2NF): This means that a table meets the requirements of 1NF and only
has full key dependencies. All the columns that compose the key for the table determine all the
non-key columns. Suppose a table had three columns that constituted the key for the table.
Second normal form means that all three key columns determine the non-key columns in the
table. Conversely, if two of the key columns also determine the non-key columns then a table
does not meet this requirement.
Third Normal Form (3NF): This means that none of the non-key columns of a table can depend
on any other non-key column. Therefore, all the non-key columns are independent of each other.
A non-key column cannot depend on another non-key column. There are no functional
dependencies between non-key columns.
Fourth Normal Form (4NF): This means that a table does not have more than one multi-valued
dependency. Multiple multi-valued dependencies are not allowed.
There are other normal forms, but this chapter focuses on the four core normal forms.
RulesofNormalization
Now that the explanation of functional dependencies and normal forms is behind you, it's time to
describe rules for normalizing a table. Earlier I stated that clear explanations are necessary for under-
standing normalization, and the problems it solves. Equally necessary is a clear process for normalizing
datasets. There are many explanations and methodologies that work well, yet are fairly complex or
difficult. However, there's one methodology that is clear, concise, and best of all easy to follow.
The idea is a simple process. First, seek to eliminate redundancy, both within rows and columns. Next,
endeavor to create all tables for a single purpose. Don't have multi-purpose tables. Simply apply these
rules to a dataset until all the tables fit the definition. Also, as you create tables give each table a key
column. That way you'll have a way to easily relate the tables together as you work through this process.
A few iterations may be necessary before you finish. That's OK; hang in there. Just keep iterating over
your tables until they're all single-purpose tables, and the redundancy has been eliminated or reduced.
As you'll see later in the section on denormalization, in some cases data redundancy is good to minimize
the number of joins.
Search WWH ::




Custom Search