Database Reference
In-Depth Information
When I turned 12, I received a trunk full of baseball cards as a birthday present from my
parents. I was delighted, not just because there might have been a Hank Aaron or Pete Rose
buried somewhere in that trunk, but because I loved to organize the cards. I categorized
each card according to year and team. Organizing the cards in this way gave me a deep
understanding of the players and their teams. To this day, I can answer many baseball trivia
questions.
Normalization, in general, is the process of applying a set of rules with the goal of organ-
izing something . I was normalizing the baseball cards according to year and team. We can
also apply a set of rules and normalize the attributes within our organizations. The rules are
based on how the business works, and the relational data model captures the business rules,
which is why normalization is the primary technique used in building the relational logical
data model.
Just as those baseball cards initially lay unsorted in that trunk, our companies have huge
numbers of attributes spread throughout departments and applications. The rules applied
to normalizing the baseball cards entailed first sorting by year and then by team within
a year. The rules for normalizing our attributes can be boiled down to a single sentence:
Make sure every attribute is single-valued and provides a fact completely and only about
its primary key. Single-valued means an attribute must contain only one piece of inform-
ation. If Consumer Name contains Consumer First Name and Consumer Last Name ,
for example, we must split Consumer Name into two attributes - Consumer First Name
and Consumer Last Name . Provides a fact means that a given primary key value will al-
ways return no more than one of every attribute that is in the same entity with this key. If
a Customer Identifier value of 123 , for example, returns three customer last names ( Smith ,
Jones , and Roberts ), Customer Last Name violates this part of the normalization defini-
tion. Completely means that the minimal set of attributes that uniquely identify an instance
of the entity is present in the primary key. Only means that each attribute must provide a
fact about the primary key and nothing else. That is, there can be no hidden dependencies,
so we would need to remove derived attributes.
We cannot determine if every attribute is single-valued and provides a fact completely and
only about its primary key unless we understand the data. To understand the data, we usu-
ally need to ask lots of questions. Therefore a second definition for normalization is: A
formal process of asking business questions. Even for an apparently simple attribute such
as Phone Number , for example, we can ask many questions:
Whose phone number is this?
Search WWH ::




Custom Search