Data Mining of Association Rules and the Process of Knowledge Discovery in Databases - Advances in Data Mining


Furthermore, we covered the fundamentals of the process of knowledge discovery in databases (KDD).

From both we learned that, with regard to human involvement and interactivity, the current situation is far from satisfactory. We worked out the basic problem and then tackled it from three sides:

First, there is the algorithmic complexity. We demonstrated that today's state-of-the-art algorithms offer impressive performance given the immense search space they need to deal with. Even so, we came to the conclusion that this is still not enough to allow true interactivity in a human-centered KDD process. We therefore presented a rule caching schema that significantly reduces the number of mining runs. This schema helps to gain interactivity even when the run times of the mining algorithms are extreme, because accessing a properly implemented cache takes only seconds.
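
The effect of such a cache can be illustrated with a minimal sketch. It is not the caching schema summarized above, whose details are not repeated in this section. It assumes one simple policy: rules mined at loose thresholds are kept, and any later query with equal or stricter minimum support and confidence is answered by filtering them. The names `Rule`, `RuleCache`, `toy_miner` and `ALL_RULES` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class Rule:
    body: FrozenSet[str]
    head: FrozenSet[str]
    support: float
    confidence: float


# Hypothetical miner signature: (minsup, minconf) -> all rules meeting both thresholds.
Miner = Callable[[float, float], List[Rule]]


class RuleCache:
    """Answers threshold queries from cached rules when possible (illustrative only)."""

    def __init__(self, mine: Miner) -> None:
        self._mine = mine
        self._rules: Optional[List[Rule]] = None
        self._minsup = 0.0
        self._minconf = 0.0
        self.mining_runs = 0

    def query(self, minsup: float, minconf: float) -> List[Rule]:
        # The cache covers a query if it was filled with thresholds at least as loose.
        covered = (
            self._rules is not None
            and minsup >= self._minsup
            and minconf >= self._minconf
        )
        if not covered:
            if self._rules is not None:
                # Re-mine with the loosest thresholds seen so far.
                minsup_run = min(minsup, self._minsup)
                minconf_run = min(minconf, self._minconf)
            else:
                minsup_run, minconf_run = minsup, minconf
            self._rules = self._mine(minsup_run, minconf_run)
            self._minsup, self._minconf = minsup_run, minconf_run
            self.mining_runs += 1
        # Filtering the cached rules replaces a full mining run.
        return [
            r for r in self._rules
            if r.support >= minsup and r.confidence >= minconf
        ]


# Toy stand-in for an expensive mining algorithm (made-up rules and measures).
ALL_RULES = [
    Rule(frozenset({"bread"}), frozenset({"butter"}), 0.30, 0.80),
    Rule(frozenset({"beer"}), frozenset({"chips"}), 0.10, 0.60),
    Rule(frozenset({"milk"}), frozenset({"bread"}), 0.05, 0.40),
]


def toy_miner(minsup: float, minconf: float) -> List[Rule]:
    return [r for r in ALL_RULES if r.support >= minsup and r.confidence >= minconf]


cache = RuleCache(toy_miner)
loose = cache.query(0.05, 0.40)   # first call triggers a mining run
strict = cache.query(0.20, 0.70)  # stricter thresholds are served from the cache
```

In this sketch only a query with looser thresholds than any seen before causes a new mining run. Every other query costs one pass over the cached rules.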

Second, we pointed out that the integration of the mining algorithm with the other KDD phases is also a crucial aspect. Interactivity suffers tremendously when proceeding from one KDD phase to the next is not smooth but requires tedious manual intervention by the user. To address this, we presented an efficient integration of association rule mining algorithms with modern database systems.
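
To give a flavour of what keeping the mining step close to the database can look like, the following sketch counts the support of item pairs directly in SQL. It is an illustration under stated assumptions, not the integration referred to above. The table `baskets(tid, item)`, the sample rows and the helper `frequent_pairs` are hypothetical, and each (tid, item) pair is assumed to occur at most once.

```python
import sqlite3

# In-memory database with a hypothetical transaction table: one row per (tid, item).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE baskets (tid INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO baskets VALUES (?, ?)",
    [
        (1, "bread"), (1, "butter"),
        (2, "bread"), (2, "butter"), (2, "milk"),
        (3, "bread"),
        (4, "milk"),
    ],
)


def frequent_pairs(connection: sqlite3.Connection, min_count: int):
    """Return (item_a, item_b, count) for pairs bought together at least min_count times."""
    query = """
        SELECT a.item, b.item, COUNT(*) AS cnt
        FROM baskets AS a
        JOIN baskets AS b
          ON a.tid = b.tid AND a.item < b.item   -- each unordered pair once
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY cnt DESC, a.item, b.item
    """
    return connection.execute(query, (min_count,)).fetchall()


pairs = frequent_pairs(conn, 2)
```

Because the counting happens inside the database, the data does not have to be exported and re-imported between the preprocessing and mining phases.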

Third, the data mining analyst must pick the interesting rules from the set of generated rules. This can be quite costly because the generated rule sets are normally large (more than 100,000 rules are not uncommon), whereas the useful rules typically make up only a very small fraction of them. We enhanced the traditional association rule mining framework by giving structure to the items. Adding attributes to the items as proposed does not affect the mining procedure but introduces a new means of formulating practically important mining queries.
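
As a minimal sketch of such a query, the rules below are selected by predicates over item attributes after mining has finished, so the mining step itself is untouched. The attribute names (`category`, `price`), the catalogue `ITEM_ATTRS`, the sample rules and the helper `select_rules` are all hypothetical and serve only to illustrate the idea.

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    body: FrozenSet[str]
    head: FrozenSet[str]
    confidence: float


# Hypothetical item catalogue: attribute names and values are made up for illustration.
ITEM_ATTRS: Dict[str, Dict[str, object]] = {
    "bread": {"category": "bakery", "price": 2.0},
    "butter": {"category": "dairy", "price": 1.5},
    "wine": {"category": "beverages", "price": 12.0},
    "cheese": {"category": "dairy", "price": 6.0},
}


def select_rules(
    rules: List[Rule],
    head_predicate: Callable[[Dict[str, object]], bool],
) -> List[Rule]:
    """Keep rules in which at least one head item satisfies the attribute predicate."""
    return [
        r for r in rules
        if any(head_predicate(ITEM_ATTRS[item]) for item in r.head)
    ]


# Made-up mined rules; the mining step knows nothing about the attributes.
rules = [
    Rule(frozenset({"bread"}), frozenset({"butter"}), 0.8),
    Rule(frozenset({"cheese"}), frozenset({"wine"}), 0.6),
    Rule(frozenset({"wine"}), frozenset({"cheese", "bread"}), 0.5),
]

# "Which rules predict a dairy product?" and "Which rules predict an expensive item?"
dairy_heads = select_rules(rules, lambda attrs: attrs["category"] == "dairy")
pricey_heads = select_rules(rules, lambda attrs: attrs["price"] > 10.0)
```

Queries of this kind let the analyst narrow a large rule set by what the items are, rather than by listing individual item names.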

