The problem with this schema is that inconsistencies can occur while the data is being modified. Say Brown takes over as the head of the Accounting department. We need to update multiple rows to reflect this change, and that's a pain and introduces opportunities for error. If the “Jones” row says the head of the department is something different from the “Brown” row, there's no way to know which is right. It's like the old saying, “A person with two watches never knows what time it is.” Furthermore, we can't represent a department without employees—if we delete all employees in the Accounting department, we lose all records about the department itself. To avoid these problems, we need to normalize the table by separating the employee and department entities. This process results in the following two tables for employees:
EMPLOYEE_NAME    DEPARTMENT
Jones            Accounting
Smith            Engineering
Brown            Accounting
Green            Engineering
and departments:
DEPARTMENT     HEAD
Accounting     Jones
Engineering    Smith
These tables are now in second normal form, which is good enough for many purposes.
However, second normal form is only one of many possible normal forms.
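Expressed as SQL, the normalized design might look something like the following sketch; the types, sizes, and foreign key constraint are assumptions for illustration, not part of the original example:

CREATE TABLE department (
    department  VARCHAR(50) NOT NULL,
    head        VARCHAR(50) NOT NULL,
    PRIMARY KEY (department)
);

CREATE TABLE employee (
    employee_name VARCHAR(50) NOT NULL,
    department    VARCHAR(50) NOT NULL,
    PRIMARY KEY (employee_name),
    -- Each employee belongs to exactly one department.
    FOREIGN KEY (department) REFERENCES department (department)
);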
We're using the last name as the primary key here for purposes of illustration, because it's the “natural identifier” of the data. In practice, however, we wouldn't do that. It's not guaranteed to be unique, and it's usually a bad idea to use a long string for a primary key.
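A more realistic design would use a compact integer surrogate key instead, for example (a sketch assuming MySQL-style AUTO_INCREMENT; the column names are illustrative):

CREATE TABLE employee (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    last_name  VARCHAR(50)  NOT NULL,
    -- The surrogate key is short, guaranteed unique, and stays
    -- stable even if the employee's name changes.
    PRIMARY KEY (id)
);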
Pros and Cons of a Normalized Schema
People who ask for help with performance issues are frequently advised to normalize
their schemas, especially if the workload is write-heavy. This is often good advice. It
works well for the following reasons:
• Normalized updates are usually faster than denormalized updates (see the sketch after this list).
• When the data is well normalized, there's little or no duplicated data, so there's less data to change.
• Normalized tables are usually smaller, so they fit better in memory and perform better.
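To make the first point concrete, consider again the change where Brown takes over as head of Accounting. In the normalized schema sketched above, that's a single-row update; in the original denormalized design (called employee_dept here purely for illustration), the department head is repeated on every Accounting row, so all of those rows must be touched:

-- Normalized: the fact lives in one place, so one row changes.
UPDATE department
SET head = 'Brown'
WHERE department = 'Accounting';

-- Denormalized: the head is duplicated on every Accounting row,
-- so all of those rows must be updated.
UPDATE employee_dept
SET head = 'Brown'
WHERE department = 'Accounting';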