Database Reference
In-Depth Information
perspectives. For example, a time dimension can be used to analyze changes in
sales over various periods of time, whereas a location dimension can be used to
analyze sales according to the geographical distribution of stores. Users may
combine several analysis perspectives (i.e., dimensions) according to their
needs. For example, a user may require information about sales of computer
accessories (the product dimension) in July 2012 (the time dimension) at all
store locations (the store dimension). Dimensions typically include attributes
that form hierarchies , which allow users to explore measures at various
levels of detail. Examples of hierarchies are month-quarter-year in the time
dimension and city-state-country in the location dimension. Aggregation of
measures takes place when a hierarchy is traversed. For example, moving in
a hierarchy from a month level to a year level will yield aggregated values of
sales for the various years.
From a methodological point of view, data warehouses must be designed
analogously to operational databases, that is, following the four-step process
consisting of requirements specification and conceptual, logical, and physical
design. However, there is still no widely accepted conceptual model for
data warehouse applications. Due to this, data warehouse design is usually
performed at the logical level, leading to schemas that are dicult to
understand by a typical user. We believe that a conceptual model that clearly
stands on top of the logical level is required for data warehouse design. In this
topic, we use the MultiDim model , which is powerful enough to represent
the complex characteristics of data warehouses at an abstraction level higher
than the logical model. We study conceptual modeling for data warehouses
in Chap. 4 .
At the logical level , the multidimensional model is usually represented by
relational tables organized in specialized structures called star schemas and
snowflake schemas. These relational schemas relate a fact table to several
dimension tables. Star schemas use a unique table for each dimension,
even in the presence of hierarchies, which yields denormalized dimension
tables. On the other hand, snowflake schemas use normalized tables for
dimensions and their hierarchies. Then, over this relational representation
of a data warehouse, an OLAP server builds a data cube, which provides a
multidimensional view of the data warehouse. Logical modeling is studied in
Chap. 5 .
Once a data warehouse has been implemented, analytical queries must
be addressed to it. MDX (MultiDimensional eXpressions) is the de facto
standard language for querying a multidimensional database, although it
can also be used in the definition of data cubes. Thus, MDX provides a
functionality for multidimensional databases similar to the one provided by
SQL (Structured Query Language) for traditional relational databases. The
MDX language is studied (and compared to SQL) in Chap. 6 .
The physical level is concerned with implementation issues. Given the
size of a typical data warehouse, physical design is crucial to ensure adequate
response time to the complex ad hoc queries that must be supported. Three
Search WWH ::




Custom Search