Database Reference
In-Depth Information
Balancing Under- and Overnormalization
When designing a database, you have the following two extreme options:
Save the complete data in a single, flat table with little to no normalization.
Save the data in fine-grained tables by exploding every attribute into its own table and thus
allowing every attribute to save an unlimited number of multiple values.
Don't get me wrong. There are excellent places where you can put NoSQL-style databases to work using an
ID-value pair mechanism for storage and even retrieval. But, there is a need for relational storage. Reasonable
normalization enhances database performance. The presence of wide tables with a large number of columns is
usually a characteristic of an undernormalized database. Undernormalization causes excessive repetition of data,
which can lead to improper results and often hurts query performance. For example, in an ordering system, you can
keep a customer's profile and all the orders placed by the customer in a single table, as shown in Table 26-1 .
Table 26-1. Original Customers Table
CustID
Name
Address
Phone
OrderDt
ShippingAddress
100
100
Liu Hong
Liu Hong
Boise, ID, USA
Boise, ID, USA
123-456-7890
123-456-7890
08-Jul-04
10-Jul-04
Boise, ID, USA
Austin, TX, USA
Keeping the customer profile and the order information together in a single table will repeat the customer profile
in every order placed by the customer, making the rows in the table very wide. Consequently, fewer customer profiles
can be saved in one data page. For a query interested in a range of customer profiles (not their order information),
more pages have to be read compared to a design in which customer profiles are kept in a separate table. Also,
with every bit of data in one large table, you're going to see a lot more locking and concurrency issues since more
people are going to access the same data out of the same page or row much more frequently. This is especially true
because you'll be storing fewer rows of data on each page because they're wider. To avoid the performance impact of
undernormalization, you must normalize the two logical entities (for example, customer profile and orders), which
have a one-to-many type of relationship, into separate tables, as shown in Tables 26-2 and 26-3 .
Table 26-2. New Customers Table
CustID
Name
Address
Phone
100
Liu Hong
Boise, ID, USA
123-456-7890
Table 26-3. Orders Table
CustID
OrderDt
ShippingAddress
100
08-Jul-04
Boise, ID, USA
100
10-Jul-04
Austin, TX, USA
Yes, there are further normalization opportunities possible with these tables; however, that's up to you, working
with your business, to determine whether they're needed.
 
 
Search WWH ::




Custom Search