Operational metadata, which include data lineage (history of migrated data and the
sequence of transformations applied to it), currency of data (active, archived, or
purged), and monitoring information (warehouse usage statistics, error reports, and
audit trails).
The algorithms used for summarization, which include measure and dimension
definition algorithms, data on granularity, partitions, subject areas, aggregation,
summarization, and predefined queries and reports.
Mapping from the operational environment to the data warehouse, which includes
source databases and their contents, gateway descriptions, data partitions, data
extraction, cleaning, transformation rules and defaults, data refresh and purging
rules, and security (user authorization and access control).
Data related to system performance, which include indices and profiles that improve
data access and retrieval performance, in addition to rules for the timing and
scheduling of refresh, update, and replication cycles.
Business metadata, which include business terms and definitions, data ownership
information, and charging policies.
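As a rough illustration, the categories above could be grouped into one record per warehouse table. The following Python sketch is hypothetical: the class name TableMetadata, its field names, and the sample values are assumptions made for this example and do not come from any particular warehouse product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TableMetadata:
    """Hypothetical metadata record for one warehouse table."""
    name: str
    # Operational metadata: lineage and currency of the data.
    lineage: List[str] = field(default_factory=list)
    currency: str = "active"  # "active", "archived", or "purged"
    # Summarization algorithms: granularity and aggregation definitions.
    granularity: str = ""
    aggregations: Dict[str, str] = field(default_factory=dict)
    # Mapping from the operational environment: sources and transformation rules.
    source_databases: List[str] = field(default_factory=list)
    transformation_rules: List[str] = field(default_factory=list)
    # System performance: indices and refresh scheduling.
    indices: List[str] = field(default_factory=list)
    refresh_schedule: str = ""
    # Business metadata: term definitions and data ownership.
    business_terms: Dict[str, str] = field(default_factory=dict)
    owner: str = ""


# Sample record; every value here is invented for illustration.
sales = TableMetadata(
    name="sales_fact",
    lineage=["extracted from orders_db", "currency converted to USD"],
    granularity="day",
    aggregations={"dollars_sold": "sum"},
    source_databases=["orders_db"],
    indices=["time_key", "item_key"],
    refresh_schedule="nightly",
    business_terms={"dollars_sold": "Sales amount in US dollars"},
    owner="Sales department",
)
print(sales.name, sales.currency, sales.aggregations)
```

Keeping all five categories in one record is only one possible design; a real repository might store each category in its own table and join them by table name.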
A data warehouse contains different levels of summarization, of which metadata is one.
Other types include current detailed data (which are almost always on disk), older
detailed data (which are usually on tertiary storage), lightly summarized data, and highly
summarized data (which may or may not be physically housed).
Metadata play a very different role than other data warehouse data and are important
for many reasons. For example, metadata are used as a directory to help the decision
support system analyst locate the contents of the data warehouse, and as a guide to
the data mapping when data are transformed from the operational environment to the
data warehouse environment. Metadata also serve as a guide to the algorithms used for
summarization between the current detailed data and the lightly summarized data, and
between the lightly summarized data and the highly summarized data. Metadata should
be stored and managed persistently (i.e., on disk).
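To make the role of metadata as a guide to summarization more concrete, the sketch below stores the summarization rules as data and uses them to derive lightly summarized (monthly) and highly summarized (yearly) totals from detailed records. The sample rows, the rule format, and the function name summarize are all assumptions for illustration, not a description of how any specific warehouse implements this.

```python
from collections import defaultdict

# Hypothetical current detailed data: (date "YYYY-MM-DD", dollars_sold).
detailed = [
    ("2010-01-05", 100.0),
    ("2010-01-20", 50.0),
    ("2010-02-03", 75.0),
    ("2011-01-10", 25.0),
]

# Hypothetical metadata: for each summarization level, the level that feeds it,
# how many leading characters of the key define a group, and the aggregate.
summarization_metadata = {
    "lightly_summarized": {"source": "detailed", "key_length": 7, "aggregate": sum},  # month
    "highly_summarized": {"source": "lightly_summarized", "key_length": 4, "aggregate": sum},  # year
}


def summarize(rows, rule):
    """Group rows by a prefix of their key and aggregate as the rule directs."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key[: rule["key_length"]]].append(value)
    return sorted((k, rule["aggregate"](v)) for k, v in groups.items())


# Build each level from its source level, in the order the metadata lists them.
levels = {"detailed": detailed}
for level, rule in summarization_metadata.items():
    levels[level] = summarize(levels[rule["source"]], rule)

print(levels["lightly_summarized"])
print(levels["highly_summarized"])
```

Because the rules live in the metadata rather than in the code, an analyst can read them to see exactly how each summarized level was derived from the one below it.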
4.2 Data Warehouse Modeling: Data Cube and OLAP
Data warehouses and OLAP tools are based on a multidimensional data model. This
model views data in the form of a data cube. In this section, you will learn how data cubes
model n-dimensional data (Section 4.2.1). In Section 4.2.2, various multidimensional
models are shown: star schema, snowflake schema, and fact constellation. You will also
learn about concept hierarchies (Section 4.2.3) and measures (Section 4.2.4) and how
they can be used in basic OLAP operations to allow interactive mining at multiple levels
of abstraction. Typical OLAP operations such as drill-down and roll-up are illustrated
 