13
Parallel Execution of SQL Based
Association Rule Mining
Masaru Kitsuregawa
Iko Pramudiono
Institute of Industrial Science, The University of Tokyo
Takeshi Yoshizawa
IBM Japan Co., Ltd
Takayuki Tamura
Mitsubishi Electric
ABSTRACT
Association rule mining over large-scale databases has been recognized as a powerful
tool for extracting valuable hidden information from those databases. However, in most
cases the user has to pull the data out of the database and rely on an external specialized
program to perform the mining. Here we present our examination of association rule mining
using native SQL on parallel platforms, namely an experimental PC cluster and a
commercial parallel RDBMS. Integration into the RDB framework offers portability
and ease of maintenance. We show that parallelism is the key to achieving sufficient
performance.
1 Introduction
In the business world, data mining over data warehouses has become a crucial
weapon for gaining an edge over competitors. Organizations have
accumulated large amounts of transaction data by means of data collection tools
such as POS, and they want to extract value-added information, such as unknown
buying patterns, from those large databases. This demand has fueled the growing
popularity of data mining.
One method of data mining is finding association rules. An association rule
implies a certain association relationship, such as "occur together" or "one implies
the other", among a set of objects 1). This mining, also known as "basket
data analysis", retrieves information like "90% of the customers who buy A and B
also buy C" from transaction data.
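
To make a statement such as "90% of the customers who buy A and B also buy C" concrete, the following minimal sketch computes the two quantities usually attached to an association rule: support (how often the items occur together) and confidence (how often the consequent appears given the antecedent). These terms are the standard ones in the association rule literature and are not defined in this section. The toy transactions and the function names are hypothetical and chosen only for illustration; this is plain in-memory Python, not the SQL-based method examined in this chapter.

```python
# Toy basket data (hypothetical, for illustration only).
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Among transactions containing `antecedent`, the fraction that
    also contain `consequent`. Returns 0.0 if the antecedent never occurs."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


# Rule under test: customers who buy A and B also buy C.
rule_support = support({"A", "B", "C"}, transactions)
rule_confidence = confidence({"A", "B"}, {"C"}, transactions)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In this toy data, three of the five baskets contain both A and B, and two of those three also contain C, so the rule holds for two thirds of the matching customers rather than 90%.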