Fig. 16.1 Subspace clustering: two different groupings of the same data are seen when considering
the subspace consisting of dimensions x and y (left) or the subspace consisting of dimensions z and
w (center), whereas the subspace projection y and z (right) does not show any clear clusters
Fig. 16.2 Frequent itemset mining: transactions for the example are listed (left),
frequent itemsets are detected when considering just the combination of item a and c,
or when considering a and d, but not when considering e.g. c and d

Transactions
ID  Items
1   a c
2   a c e
3   a d
4   a b c
5   a d
6   a b d
7   a d e

Example frequencies
Itemset  Frequency
a c      4 times
a d      4 times
c d      not found
data is present in frequent itemset mining as well (cf. Fig. 16.2): an item can be part
of two different patterns such as {a, c} or {a, d}, but the combination of {c, d} does
not necessarily yield frequent patterns.
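The point can be checked with a minimal sketch that counts how often an itemset occurs in the seven transactions of Fig. 16.2. The helper names `support` and `frequent_pairs` and the threshold `min_support=3` are assumptions made for illustration, and the brute-force enumeration of pairs is not one of the mining algorithms discussed in the literature. Counted over the transactions as listed here, {a, c} occurs 3 times (the figure annotates 4), {a, d} occurs 4 times, and {c, d} never occurs.

```python
from itertools import combinations

# Transactions as listed in Fig. 16.2 (IDs 1 to 7).
transactions = [
    {"a", "c"},
    {"a", "c", "e"},
    {"a", "d"},
    {"a", "b", "c"},
    {"a", "d"},
    {"a", "b", "d"},
    {"a", "d", "e"},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def frequent_pairs(transactions, min_support):
    """All 2-itemsets with support >= min_support (brute force, illustration only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        count = support(pair, transactions)
        if count >= min_support:
            result[pair] = count
    return result


if __name__ == "__main__":
    for pair in [("a", "c"), ("a", "d"), ("c", "d")]:
        print(pair, support(pair, transactions))
    # min_support=3 is a hypothetical threshold chosen for this example.
    print(frequent_pairs(transactions, min_support=3))
```

Item a takes part in both frequent pairs, while c and d never occur together, so combining the two patterns into {c, d} yields nothing frequent.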
There are several surveys and overview articles that specifically discuss subspace
clustering [9, 50, 52, 53, 67, 74, 83], some of which also point out the connection
to frequent pattern mining algorithms. The first survey of this young field
was presented by Parsons et al. [67], drawing the research community's attention
to the problem and sketching a few early algorithms. In the following years, the
problem was studied in much more detail, and categories of similar approaches were
defined [50]. A short discussion of the fundamental problems and strategies has
been provided by Kröger and Zimek [53]. Assent gives an overview in the context
of high-dimensional data of different provenance, including time series and text
documents [9]. Sim et al. [74] discuss 'enhanced' subspace clustering, i.e., they
point out particular open problems in the field and discuss methods specifically
addressing those problems. Kriegel et al. [52] give a concise overview and point to
open questions as well. Based on this overview, an updated discussion was given
by Zimek [83]. Recent textbooks by Han et al. [38] and Gan et al. [31] sketch
prominent issues and example algorithms. Recent experimental evaluation studies
have compared some subspace clustering algorithms [60, 63].
The close relationship between the two areas of subspace clustering and frequent
pattern mining has been elaborated in a broader perspective by Zimek and Vreeken