Database Reference
Here is an alternate definition: A data warehouse is a relational database that is
typically constructed from multiple transactional databases (called source databases),
and designed for query and analysis rather than transaction processing. The data
warehouse usually contains historical data that is derived from transaction data from
multiple sources. It separates analysis workload from transaction workload, and enables a
business to consolidate data from several sources.
In addition to a relational database, a data warehouse environment often consists
of an ETL (extract, transform, and load) solution, an OLAP (online analytical
processing) engine, client analysis tools, and other applications that manage the process
of gathering data and delivering it to business users. End users typically require some
kind of catalog that describes the data in its business context, and acts as a guide to the
location and use of this information. Finally, end users require a set of tools to analyze
and manipulate the information thus made available.
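As an illustration of the catalog idea, here is a minimal sketch in Python. It assumes a catalog can be modeled as a list of entries that pair a business term with its meaning and its location in the warehouse. The class name, the field names, and the sample entries (fact_sales, dim_customer, orders_db, crm_db) are hypothetical and serve only as an example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CatalogEntry:
    business_name: str  # term a business user would search for
    description: str    # meaning of the data in its business context
    location: str       # where the data lives in the warehouse (table.column)
    source: str         # source database the data was derived from


# Hypothetical entries, for illustration only.
CATALOG = [
    CatalogEntry("Net Sales", "Invoiced amount less returns, per day",
                 "fact_sales.net_amount", "orders_db"),
    CatalogEntry("Customer Region", "Sales region assigned to the customer",
                 "dim_customer.region", "crm_db"),
]


def find_entries(term: str) -> List[CatalogEntry]:
    """Case-insensitive search over business names and descriptions."""
    needle = term.lower()
    return [
        entry for entry in CATALOG
        if needle in entry.business_name.lower()
        or needle in entry.description.lower()
    ]
```

A user searching for "region" would be pointed to dim_customer.region together with its business description and its originating source database.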
As you are no doubt aware by now, achieving comprehensiveness and consistency
of data in today's business environment is often a complex and challenging undertaking.
This is also true for a data warehouse. The following steps are necessary for the
construction of a data warehouse:
1. Conduct an information infrastructure analysis to determine
   the required structure of the data warehouse.
2. Identify the source databases that will feed the data warehouse.
3. Design the integrated logical data model and determine the
   architecture of the data warehouse.
4. Develop and implement a comprehensive meta-data
   methodology.
5. Determine, and then implement the physical structure of the
   data warehouse.
6. Design and implement an integrated staging area for the
   data warehouse.
7. Extract, transform and load the data (from various sources)
   into the data warehouse. This involves first cleansing the
   source data of various structural and content errors.
8. Conduct comprehensive post-implementation review(s) to
   ensure that the data warehouse is performing acceptably.
9. Maintain the data warehouse.
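The extract, transform, and load step in the list above can be sketched as follows. This is a minimal illustration, not a production design: it uses Python's built-in sqlite3 module with in-memory databases as stand-ins for the source databases and the warehouse. The table names (orders, fact_orders), the source names (store_db, web_db), the sample rows, and the cleansing rules are all assumptions made for the example.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory stand-in for a transactional source database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount TEXT)")
    db.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return db


def extract(db, source_name):
    """Extract: read raw rows and tag each one with its source."""
    query = "SELECT order_id, customer, amount FROM orders"
    for order_id, customer, amount in db.execute(query):
        yield source_name, order_id, customer, amount


def transform(rows):
    """Transform: cleanse structural and content errors before loading."""
    for source, order_id, customer, amount in rows:
        if customer is None or not customer.strip():
            continue  # content error: missing customer name
        try:
            value = float(amount)
        except (TypeError, ValueError):
            continue  # structural error: amount is not numeric
        yield source, order_id, customer.strip().upper(), value


def load(warehouse, rows):
    """Load: append the cleansed rows to the warehouse fact table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(source TEXT, order_id INTEGER, customer TEXT, amount REAL)"
    )
    warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()


def run_etl():
    """Run the three stages over two hypothetical source databases."""
    sources = {
        "store_db": make_source([(1, " alice ", "10.50"), (2, "", "3.00")]),
        "web_db": make_source([(1, "Bob", "n/a"), (2, "alice", "4.50")]),
    }
    warehouse = sqlite3.connect(":memory:")
    for name, db in sources.items():
        load(warehouse, transform(extract(db, name)))
    return warehouse
```

Calling run_etl() returns a warehouse connection in which the two malformed rows have been dropped and the remaining rows from both sources sit in one table, ready for analytical queries such as totals per customer. A real warehouse would typically stage the raw data first and log rejected rows rather than silently skipping them.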
Data mining is the act of extracting data/information from assorted sources and
presenting that information in a manner consistent with user requirements. Data
mining often implies the existence of data warehouses, so the two terms are closely
related. Another related term is information extraction (IE): the extraction of structured
information from unstructured text. IE sometimes involves accessing data warehouses,
either as the source or as the destination of the information.
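To make the idea of information extraction concrete, here is a minimal sketch that pulls structured records out of free text with a regular expression. The sentence pattern ("order #N for NAME totaling $AMOUNT") and the field names are assumptions chosen for the example. Practical IE systems usually rely on far richer techniques than a single pattern.

```python
import re

# Hypothetical pattern for sentences such as "order #17 for Alice totaling $10.50".
ORDER_PATTERN = re.compile(
    r"order\s+#(?P<order_id>\d+)\s+for\s+(?P<customer>[A-Za-z]+)"
    r"\s+totaling\s+\$(?P<amount>\d+(?:\.\d{2})?)",
    re.IGNORECASE,
)


def extract_orders(text):
    """Return one structured record (a dict) per order mentioned in the text."""
    return [
        {
            "order_id": int(match["order_id"]),
            "customer": match["customer"],
            "amount": float(match["amount"]),
        }
        for match in ORDER_PATTERN.finditer(text)
    ]
```

The resulting records could then be loaded into a data warehouse table, which is the sense in which a warehouse can act as the destination of extracted information.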
 