Frequent Pattern Mining Algorithms for Data Clustering - Frequent Pattern Mining - page 398


Fig. 16.1 Subspace clustering: two different groupings of the same data are seen when considering the subspace consisting of dimensions x and y (left) or the subspace consisting of dimensions z and w (center), whereas the subspace projection y and z (right) does not show any clear clusters

Fig. 16.2 Frequent itemset mining: transactions for the example are listed (left), frequent itemsets are detected when considering just the combination of item a and c, or when considering a and d, but not when considering e.g. c and d

Transactions
1    a c
2    a c e
3    a d
4    a b c
5    a d
6    a b d
7    a d e

Example frequencies
a c    4 times
a d    4 times
c d    not found

data is present in frequent itemset mining as well (cf. Fig. 16.2): an item can be part of two different patterns such as {a, c} or {a, d}, but the combination of {c, d} does not necessarily yield frequent patterns.
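The observation can be checked with a minimal sketch that counts pair supports over the seven transactions as they are listed for Fig. 16.2. The function name `pair_support` and the threshold `MIN_SUPPORT = 3` are assumptions made for illustration; the text does not state a minimum support. Counting over the listing as given here yields 3 for {a, c}, whereas the figure states 4, so the listing may be incomplete.

```python
from collections import Counter
from itertools import combinations

# Transactions as listed for Fig. 16.2 (left).
transactions = [
    {"a", "c"},
    {"a", "c", "e"},
    {"a", "d"},
    {"a", "b", "c"},
    {"a", "d"},
    {"a", "b", "d"},
    {"a", "d", "e"},
]


def pair_support(transactions):
    """Count, for every pair of items, how many transactions contain both."""
    counts = Counter()
    for t in transactions:
        # Sorting gives each pair a canonical order, e.g. ("a", "c").
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    return counts


MIN_SUPPORT = 3  # assumed threshold, not given in the text

support = pair_support(transactions)
frequent_pairs = {p for p, n in support.items() if n >= MIN_SUPPORT}

for pair in [("a", "c"), ("a", "d"), ("c", "d")]:
    status = "frequent" if pair in frequent_pairs else "not frequent"
    print(pair, support[pair], status)
```

Item a takes part in both frequent pairs, while c and d never occur together in any transaction, which mirrors how the same data can cluster in two different subspaces but not in their combination.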

There are several surveys and overview articles that specifically discuss subspace clustering [9, 50, 52, 53, 67, 74, 83], some of which also point out the connection to frequent pattern mining algorithms. The first survey to discuss the young field was presented by Parsons et al. [67], drawing the research community's attention to the problem and sketching a few early algorithms. In the following years, the problem was studied in much more detail, and categories of similar approaches have been defined [50]. A short discussion of the fundamental problems and strategies has been provided by Kröger and Zimek [53]. Assent gives an overview in the context of high-dimensional data of different provenance, including time series and text documents [9]. Sim et al. [74] discuss 'enhanced' subspace clustering, i.e., they point out particular open problems in the field and discuss methods specifically addressing those problems. Kriegel et al. [52] give a concise overview and point to open questions as well. Based on this overview, an updated discussion was given by Zimek [83]. Recent textbooks by Han et al. [38] and Gan et al. [31] sketch prominent issues and example algorithms. Recent experimental evaluation studies compared some subspace clustering algorithms [60, 63].

The close relationship between the two areas, subspace clustering and frequent pattern mining, has been elaborated in a broader perspective by Zimek and Vreeken
