1. Trends in Data Mining and Knowledge Discovery
Krzysztof J. Cios 1,2,3 and Lukasz A. Kurgan 4
1 University of Colorado at Denver and Health Sciences Center, Department
of Computer Science and Engineering, Campus Box 109, Denver, CO
80217-3364, U.S.A.; email: Krys.Cios@cudenver.edu
2 University of Colorado at Boulder, Department of Computer Science,
Boulder, CO, U.S.A.
3 4cData, LLC, Golden, CO 80401
4 University of Alberta, Department of Electrical and Computer Engineering,
ECERF 2nd floor, Edmonton, AB T6G 2V4, Canada;
email: lkurgan@ece.ualberta.ca
Data mining and knowledge discovery (DMKD) is a fast-growing field of research.
Its popularity is driven by an ever-increasing demand for tools that help
reveal and comprehend information hidden in huge amounts of data. Such
data are generated on a daily basis by federal agencies, banks, insurance
companies, retail stores, and on the WWW. This explosion came about through the
increasing use of computers, scanners, digital cameras, bar codes, etc. We are in a
situation where rich sources of data, stored in databases, warehouses, and other
data repositories, are readily available but not easily analyzable. This creates
pressure from the federal, business, and industry communities to improve
DMKD technology. What is needed is a clear and simple methodology for
extracting the knowledge hidden in the data. This chapter introduces an integrated
DMKD process model based on technologies such as XML, PMML, SOAP, UDDI, and
OLE DB-DM. These technologies help to design flexible, semiautomated,
and easy-to-use DMKD models that support building knowledge repositories and
allow communication between several data mining tools, databases, and
knowledge repositories. They also enable integration and automation of the
DMKD tasks. This chapter describes a six-step DMKD process model and its
component technologies.
1.1 Knowledge Discovery and Data Mining Process
Knowledge discovery (KD) is a nontrivial process of identifying valid, novel,
potentially useful, and ultimately understandable patterns from large collections of
data [30]. One of the crucial steps in KD is data mining (DM). DM is
concerned with the actual extraction of knowledge from data, in contrast to the
KD process, which also covers many other activities. We want to stress this
distinction, although people often use the terms DM, KD, and DMKD as
synonyms.