Parallel Execution of SQL Based Association Rule Mining - Nontraditional Database Systems


13

Parallel Execution of SQL Based Association Rule Mining

Masaru Kitsuregawa

Iko Pramudiono

Institute of Industrial Science, The University of Tokyo

Takeshi Yoshizawa

IBM Japan Co., Ltd

Takayuki Tamura

Mitsubishi Electric

ABSTRACT

Association rule mining over large-scale databases has been recognized as a powerful tool for extracting hidden, valuable information from those databases. In most cases, however, the user has to pull the data out of the database and rely on an external specialized program to perform the mining. Here we present our examination of association rule mining using native SQL on parallel platforms, namely an experimental PC cluster and a commercial parallel RDBMS. Integration into the RDB framework offers portability and ease of maintenance. We show that parallelism is the key to achieving sufficient performance.

1 Introduction

In the business world, data mining over data warehouses has become a crucial weapon for gaining an edge over competitors. Organizations have accumulated large amounts of transaction data by means of data collection tools such as POS systems, and they want to extract value-added information, such as previously unknown buying patterns, from those large databases. This demand has fueled the growing popularity of data mining.

One method of data mining is finding association rules. An association rule implies a certain association relationship, such as "occur together" or "one implies the other", among a set of objects 1). This kind of mining, also known as "basket data analysis", retrieves information like "90% of the customers who buy A and B also buy C" from transaction data.
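To make the "customers who buy A and B also buy C" statement concrete, the following is a minimal sketch of how such a percentage (commonly called the confidence of the rule) could be computed with plain SQL. It is not the SQL used in this chapter: the table name `sales`, its columns `tid` and `item`, the helper `count_transactions`, and the toy data are all assumptions made for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical toy data: one row per (transaction id, purchased item).
rows = [
    (1, "A"), (1, "B"), (1, "C"),
    (2, "A"), (2, "B"), (2, "C"),
    (3, "A"), (3, "B"),
    (4, "A"), (4, "C"),
    (5, "B"), (5, "C"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")  # assumed schema
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)


def count_transactions(conn, items):
    """Count the transactions that contain every item in `items`."""
    placeholders = ",".join("?" for _ in items)
    query = f"""
        SELECT COUNT(*) FROM (
            SELECT tid
            FROM sales
            WHERE item IN ({placeholders})
            GROUP BY tid
            HAVING COUNT(DISTINCT item) = ?
        ) AS matching
    """
    return conn.execute(query, (*items, len(items))).fetchone()[0]


# Transactions containing both A and B (the rule's left-hand side).
both = count_transactions(conn, ["A", "B"])
# Of those, the transactions that also contain C.
all_three = count_transactions(conn, ["A", "B", "C"])

# Fraction of "A and B" buyers who also bought C.
confidence = all_three / both
print(f"{confidence:.0%} of the customers who buy A and B also buy C")
```

On real transaction data the same counting queries run over far more rows and over many candidate item combinations, which is where the cost of the SQL-based approach comes from.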
