Frequent Pattern Mining Algorithms for Data Clustering - Frequent Pattern Mining

Chapter 16

Frequent Pattern Mining Algorithms for Data Clustering

Arthur Zimek, Ira Assent and Jilles Vreeken

Abstract Discovering clusters in subspaces, or subspace clustering and related clustering paradigms, is a research field with many influences from frequent pattern mining. In fact, because the first algorithms for subspace clustering were based on frequent pattern mining algorithms, it is fair to say that frequent pattern mining was at the cradle of subspace clustering; yet subspace clustering quickly developed into an independent research field.

In this chapter, we discuss how frequent pattern mining algorithms have been extended and generalized towards the discovery of local clusters in high-dimensional data. In particular, we discuss several example algorithms for subspace clustering or projected clustering, and we point out recent research questions and open topics in this area that are relevant to researchers in either clustering or pattern mining.

Keywords Subspace clustering · Monotonicity · Redundancy

1 Introduction

Data clustering is the task of discovering groups of objects in a data set that exhibit high similarity. Clustering is an unsupervised task, in that we do not have access to any additional information besides some geometry of the data, usually represented by some distance function. Useful groups should consist of objects that are more similar to each other than to objects assigned to other groups. The goal of clustering is to give the user information about the different categories of objects that the data set contains.
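
As a minimal illustration of the criterion that objects should be more similar to members of their own group than to members of other groups, the following Python sketch compares average within-group and between-group distances. The toy points, the choice of Euclidean distance, and the helper names mean_within and mean_between are assumptions made for this example only; they are not taken from the chapter.

```python
from itertools import combinations, product
from math import dist  # Euclidean distance, available in Python 3.8+


def mean_within(groups):
    """Average distance over all pairs of points that share a group."""
    pairs = [(p, q) for g in groups for p, q in combinations(g, 2)]
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_between(groups):
    """Average distance over all pairs of points from different groups."""
    pairs = [(p, q)
             for g, h in combinations(groups, 2)
             for p, q in product(g, h)]
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


# Hypothetical toy data: six points in the plane forming two tight groups.
good = [[(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
        [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]]

# The same six points, but with one point from each tight group swapped.
bad = [[(0.0, 0.0), (0.0, 1.0), (10.0, 10.0)],
       [(1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]]

for name, grouping in (("good", good), ("bad", bad)):
    print(name, mean_within(grouping), mean_between(grouping))
```

For the first grouping the within-group average is much smaller than the between-group average, whereas the second grouping mixes distant points and so has a far larger within-group average. This is only one simple way to quantify the criterion; other distance functions or quality measures could be used instead.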
