4.3.4 From Online Analytical Processing to Multidimensional Data Mining
The data mining field has conducted substantial research regarding mining on vari-
ous data types, including relational data, data from data warehouses, transaction data,
time-series data, spatial data, text data, and flat files. Multidimensional data mining
(also known as exploratory multidimensional data mining, online analytical mining,
or OLAM) integrates OLAP with data mining to uncover knowledge in multidimensional
databases. Among the many different paradigms and architectures of data mining
systems, multidimensional data mining is particularly important for the following
reasons:
High quality of data in data warehouses: Most data mining tools need to work on
integrated, consistent, and cleaned data, which requires costly data cleaning, data
integration, and data transformation as preprocessing steps. A data warehouse con-
structed by such preprocessing serves as a valuable source of high-quality data for
OLAP as well as for data mining. Notice that data mining may serve as a valuable
tool for data cleaning and data integration as well.
Available information processing infrastructure surrounding data warehouses:
Comprehensive information processing and data analysis infrastructures have been,
or will be, systematically built around data warehouses. These include facilities
for accessing, integrating, consolidating, and transforming multiple heterogeneous
databases; ODBC/OLEDB connections; Web access and service facilities; and
reporting and OLAP analysis tools. It is prudent to make the best use of the available
infrastructures rather than constructing everything from scratch.
OLAP-based exploration of multidimensional data: Effective data mining needs
exploratory data analysis. A user will often want to traverse through a database, select
portions of relevant data, analyze them at different granularities, and present knowl-
edge/results in different forms. Multidimensional data mining provides facilities for
mining on different subsets of data and at varying levels of abstraction—by drilling,
pivoting, filtering, dicing, and slicing on a data cube and/or intermediate data min-
ing results. This, together with data/knowledge visualization tools, greatly enhances
the power and flexibility of data mining.
Online selection of data mining functions: Users may not always know the specific
kinds of knowledge they want to mine. By integrating OLAP with various data min-
ing functions, multidimensional data mining provides users with the flexibility to
select desired data mining functions and swap data mining tasks dynamically.
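The last two points, OLAP-based exploration and online selection of mining functions, can be illustrated with a minimal Python sketch. It is not taken from the text and does not reflect any particular OLAM system: the SALES fact table, the dimension names, and the helpers dice, roll_up, top_item, and mine are all hypothetical names chosen for illustration. The sketch selects a subcube by dicing (restricting one dimension to a single value amounts to a slice), aggregates to a coarser granularity by rolling up, and lets the caller choose which toy mining function to run on the selected data.

```python
from collections import Counter, defaultdict

# Hypothetical fact table: (city, quarter, item, units_sold).
SALES = [
    ("Vancouver", "Q1", "phone", 5),
    ("Vancouver", "Q1", "laptop", 2),
    ("Vancouver", "Q2", "phone", 7),
    ("Toronto", "Q1", "phone", 3),
    ("Toronto", "Q2", "laptop", 6),
    ("Toronto", "Q2", "phone", 1),
]
DIMS = ("city", "quarter", "item")  # assumed dimension order of each row


def dice(rows, **allowed):
    """Keep rows whose dimension values fall in the allowed sets.

    Restricting a single dimension to one value is a slice.
    """
    idx = {d: DIMS.index(d) for d in allowed}
    return [r for r in rows
            if all(r[idx[d]] in vals for d, vals in allowed.items())]


def roll_up(rows, *keep):
    """Sum units_sold over every dimension that is not listed in keep."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[i] for i in idx)] += r[-1]
    return dict(totals)


def top_item(rows):
    """Toy mining function: the item with the most units sold."""
    counts = Counter()
    for r in rows:
        counts[r[2]] += r[3]
    return counts.most_common(1)[0][0]


# Mining functions the user can switch between at query time.
MINING_FUNCTIONS = {
    "summarize_by_city": lambda rows: roll_up(rows, "city"),
    "top_item": top_item,
}


def mine(rows, function_name, **allowed):
    """Select a subcube with dice, then apply the chosen mining function."""
    return MINING_FUNCTIONS[function_name](dice(rows, **allowed))
```

For example, mine(SALES, "top_item", city={"Toronto"}) runs the top-item function on the Toronto slice only, while mine(SALES, "summarize_by_city", quarter={"Q2"}) switches to a per-city summary of the second quarter without changing how the data is selected. Keeping selection (dice) separate from analysis (the entries in MINING_FUNCTIONS) is one simple way to mirror the flexibility described above.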
Chapter 5 describes data warehouses on a finer level by exploring implementation
issues such as data cube computation, OLAP query answering strategies, and multi-
dimensional data mining. The chapters following it are devoted to the study of data
mining techniques. As we have seen, the introduction to data warehousing and OLAP
technology presented in this chapter is essential to our study of data mining. This
is because data warehousing provides users with large amounts of clean, organized,
 