Database Reference
In-Depth Information
is the process of reversing the steps taken to achieve a normal form. Often,
it becomes necessary to violate certain normalization rules to satisfy the
real-world requirements of specific queries. Let's look at some examples.
In data models that have a completely normalized structure, there
tend to be a great many entities and relationships. To retrieve logical sets
of data, you often need a great many joins to retrieve all the pertinent in-
formation about a given object. Logically this is not a problem, but in the
physical implementation of a database, joins tend to incur overhead in
query processing time. For every table that is joined, there is usually a cost
to scan the indexes on that table and then retrieve the matching data from
each object, combine the resulting data, and deliver it to the end user (for
more on indexes and query optimization, see Chapter 10).
When millions of rows are being scanned and tens or hundreds of rows
are being returned, it is costly. In these situations, creating a denormalized
entity may offer a performance benefit, at the cost of violating one of the
normal forms. The trade-off is usually a matter of having redundant data,
because you are storing an additional physical table that duplicates data
being stored in other tables. To mitigate the storage effects of this tech-
nique, you can often store subsets of data in the duplicate table, clearing it
out and repopulating it based on the queries you know are running against
it. Additionally, this means that you have additional physical objects to
maintain if there are schema changes in the original tables. In this case, ac-
curate documentation and a managed change control process are the only
practices that can ensure that all the relevant denormalized objects stay in
sync.
Denormalization also can help when you're working on reporting ap-
plications. In larger environments, it is often necessary to generate reports
based on application data. Reporting queries often return large historical
data sets, and when you join various types of data in a single report it in-
curs a lot of overhead on standard OLTP systems. Running these queries
on exactly the same databases that the applications are trying to use can re-
sult in an overloaded system, creating blocking situations and causing end
users to wait an unacceptable amount of time for the data. Additionally, it
means storing large amounts of historical data in the OLTP system, some-
thing that may have other adverse effects, both internally to the database
management system and to the physical server resources.
Denormalizing the data in the database to a set of tables (or even to a
different physical database) specifically used for reporting can alleviate the
Search WWH ::




Custom Search