Figure 1: The Knowledge Discovery and Data Mining process
INTRODUCTION
Today, with the advent of the web and electronic commerce, nearly every organization
has a web site through which tremendous amounts of customer data are generated and collected.
These customer data contain a wealth of potentially accessible information. However, the
explosive growth of data makes it increasingly difficult to find the desired information.
As a result, there is great demand for analyzing data and transforming them into useful
information and knowledge. Knowledge Discovery and Data Mining (KDD) has therefore become
an important field in recent years, addressing the need to analyze data in very large
data repositories.
KDD is the automatic extraction of implicit, novel, useful, and understandable
patterns from large databases. The KDD process has many steps, which include data
selection, data cleaning, enrichment, coding, data-mining task, algorithm selection, and
interpretation of discovered knowledge (Adriaans & Zantinge, 1996). This process tends
to be interactive, incremental, and iterative. Figure 1 illustrates the steps of the knowledge
discovery process.
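The steps listed above can be sketched as a chain of small functions. The following is only a minimal illustration: the customer records, the field names, the REGION_INCOME lookup table, and the age bands are all hypothetical and do not come from Adriaans & Zantinge (1996), and the "mining" step is a plain frequency count standing in for a real algorithm.

```python
from collections import Counter

# Hypothetical customer records, invented for illustration only.
RAW_RECORDS = [
    {"id": 1, "age": 23, "region": "north", "purchased": "book", "notes": "n/a"},
    {"id": 2, "age": 37, "region": "south", "purchased": "music", "notes": ""},
    {"id": 2, "age": 37, "region": "south", "purchased": "music", "notes": ""},   # duplicate
    {"id": 3, "age": None, "region": "north", "purchased": "book", "notes": ""},  # missing age
    {"id": 4, "age": 29, "region": "north", "purchased": "book", "notes": ""},
]

# Hypothetical external table used by the enrichment step.
REGION_INCOME = {"north": "high", "south": "medium"}


def select(records, fields):
    """Data selection: keep only the fields relevant to the question."""
    return [{f: r[f] for f in fields} for r in records]


def clean(records):
    """Data cleaning: drop records with missing values and exact duplicates."""
    seen, result = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if None in r.values() or key in seen:
            continue
        seen.add(key)
        result.append(r)
    return result


def enrich(records):
    """Enrichment: add information from another source (here, a lookup table)."""
    return [{**r, "income": REGION_INCOME[r["region"]]} for r in records]


def code(records):
    """Coding: replace the exact age with a coarse age band."""
    coded = []
    for r in records:
        band = "under_30" if r["age"] < 30 else "30_plus"
        coded.append({"age_band": band, "income": r["income"],
                      "purchased": r["purchased"]})
    return coded


def mine(records):
    """Data mining (stand-in): count co-occurring (age_band, purchased) pairs."""
    return Counter((r["age_band"], r["purchased"]) for r in records)


selected = select(RAW_RECORDS, ["id", "age", "region", "purchased"])
patterns = mine(code(enrich(clean(selected))))

# Interpretation: list the most frequent patterns for a human to judge.
for (age_band, item), count in patterns.most_common():
    print(f"{age_band} customers bought {item}: {count}")
```

In practice the chain is rarely run once from start to finish. A result at the interpretation stage often sends the analyst back to an earlier step, for example to select different fields or to choose different age bands, which is what makes the process iterative.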
Data mining is closely related to the data warehouse, the architectural
foundation of decision support systems. The data warehouse sets the stage for
effective data mining. Data mining can be done without a data warehouse, but a data
warehouse improves the chances of success in data mining (Inmon, 1996).
Background
As the usage of the World Wide Web explodes, a massive amount of data is generated
by web servers in the form of web access logs. These logs are a rich source of information for
understanding web users' surfing behavior. Web usage mining is one type of web mining
activity that involves the automatic discovery of user access patterns on one or more web
servers. It applies data mining algorithms to web access logs to locate the regularities
in web users' access patterns.
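As a rough illustration of what locating regularities in an access log can mean, the sketch below parses a few log entries and counts how often one page is requested directly after another by the same host. It assumes the log is in Common Log Format and is already in chronological order, and it treats each host address as a single user, which is a simplification. The sample entries and the function names are hypothetical, and the transition count is a simple stand-in for a real mining algorithm.

```python
import re
from collections import Counter, defaultdict

# Common Log Format: host ident authuser [date] "method path protocol" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) \S+$'
)

# Hypothetical sample entries, invented for illustration only.
SAMPLE_LOG = """\
10.0.0.1 - - [01/Jan/2000:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 1043
10.0.0.1 - - [01/Jan/2000:10:00:20 +0000] "GET /products.html HTTP/1.0" 200 2310
10.0.0.2 - - [01/Jan/2000:10:01:00 +0000] "GET /index.html HTTP/1.0" 200 1043
10.0.0.1 - - [01/Jan/2000:10:01:30 +0000] "GET /order.html HTTP/1.0" 200 870
10.0.0.2 - - [01/Jan/2000:10:02:00 +0000] "GET /products.html HTTP/1.0" 200 2310
10.0.0.2 - - [01/Jan/2000:10:02:10 +0000] "GET /missing.html HTTP/1.0" 404 0
"""


def parse_log(text):
    """Yield (host, path) for each request that was served successfully (2xx)."""
    for line in text.splitlines():
        match = LOG_PATTERN.match(line.strip())
        if match and match.group("status").startswith("2"):
            yield match.group("host"), match.group("path")


def count_transitions(requests):
    """Count how often page A is followed directly by page B for the same host."""
    pages_by_host = defaultdict(list)
    for host, path in requests:
        pages_by_host[host].append(path)
    transitions = Counter()
    for pages in pages_by_host.values():
        # Pair each page with the next one requested by the same host.
        transitions.update(zip(pages, pages[1:]))
    return transitions


transitions = count_transitions(parse_log(SAMPLE_LOG))
for (src, dst), n in transitions.most_common():
    print(f"{src} -> {dst}: {n}")
```

A real analysis would also need to split each host's requests into sessions, since many users can share one address and one user can return days later.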
Analysis of these access data provides useful information for server performance
enhancements, restructuring web sites, and direct marketing in electronic commerce. As a
result, web usage mining has been used to improve web site design, business and marketing
decision support, user profiling, and web server system performance.
Among the methods for discovering knowledge in large databases, the association
rule has attracted great attention in database research communities in recent years (Agrawal,