Databases Reference
In-Depth Information
Another main difference between OLTP databases and data warehouses is the degree of
normalization found in them. An OLTP system uses normalized databases, usually at a
third normal form, while a data warehouse uses a denormalized dimensional model. An
OLTP normalized model helps to remove data redundancies,and focus on data integrity;
it benefits update operations as data needs to be updated in one place only. On the other
hand, a data warehouse dimensional model is more appropriate for ad hoc complex
queries, and will usually have fewer tables and require fewer joins.
Dimensional data modeling on data warehouses relies on the use of fact and dimension
tables. Fact tables contain facts or numerical measures of the business, which can
participate in calculations, while dimension tables are the attributes or descriptions of
the facts. Fact tables also usually have foreign keys to link them to the primary keys of the
dimension tables.
Data warehouses also usually follow star and snowflake schema structures. A star
schema contains a fact table and a single table for each dimension. Snowflake schemas
are similar to star schemas to the extent that they also have a fact table but, in addition,
dimension tables can also be normalized, and each dimension can have more than one
table. Fact tables are typically huge and can store millions or billions of rows, compared
to dimension tables, which are significantly smaller. The size of data warehouse databases
tends to range from hundreds of gigabytes to terabytes.
SQL Server sample databases includes AdventureWorksDW, a data warehouse database
whose purpose is to demonstrate the SQL Server business intelligence features. The
AdventureWorksDW database will be used for the example in this section.
Queries that join a fact table to dimension tables are called star join queries. SQL Server
includes special optimizations for star join queries (which we'll look at shortly), can
automatically detect star and snowflake schemas, and can reliably identify fact and
dimension tables. This is significant because sometimes, in order to avoid the overhead of
constraint enforcement during updates, data warehouse implementations don't explicitly
define foreign key constraints. In these cases, the Query Optimizer may need to rely on
heuristics to detect star schemas.
Search WWH ::




Custom Search