Fig. 16.1 Subspace clustering: two different groupings of the same data are seen when considering the subspace consisting of dimensions x and y (left) or the subspace consisting of dimensions z and w (center), whereas the subspace projection y and z (right) does not show any clear clusters
Fig. 16.2 Frequent itemset mining: transactions for the example are listed (left), frequent itemsets are detected when considering just the combination of item a and c, or when considering a and d, but not when considering e.g. c and d
Transactions
1    a c
2    a c e
3    a d
4    a b c
5    a d
6    a b d
7    a d e

Example frequencies
a c    4 times
a d    4 times
c d    not found
data is present in frequent itemset mining as well (cf. Fig. 16.2): an item can be part of two different patterns such as {a, c} or {a, d}, but the combination of {c, d} does not necessarily yield frequent patterns.
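This observation can be checked directly on the transactions listed in Fig. 16.2. The following is a minimal brute-force sketch in Python, not an algorithm taken from the text: the function names `support` and `frequent_pairs` and the threshold `MIN_SUPPORT = 3` are assumptions chosen for illustration, and a practical miner such as Apriori would prune candidates rather than enumerate every pair.

```python
from itertools import combinations

# Transactions as listed in Fig. 16.2 (IDs 1 to 7, in order).
transactions = [
    {"a", "c"},
    {"a", "c", "e"},
    {"a", "d"},
    {"a", "b", "c"},
    {"a", "d"},
    {"a", "b", "d"},
    {"a", "d", "e"},
]


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def frequent_pairs(transactions, min_support):
    """Return all 2-itemsets whose support reaches `min_support`.

    Brute force over all item pairs; fine for a toy example only.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        count = support(pair, transactions)
        if count >= min_support:
            result[pair] = count
    return result


# Hypothetical threshold, chosen only for this illustration.
MIN_SUPPORT = 3
pairs = frequent_pairs(transactions, MIN_SUPPORT)

for pair, count in pairs.items():
    print(pair, count)
print("support of {c, d}:", support({"c", "d"}, transactions))
```

Under this assumed threshold, both pairs that contain a are reported as frequent, whereas {c, d} is not, because no listed transaction contains c and d together.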
There are several surveys and overview articles that specifically discuss subspace clustering [9, 50, 52, 53, 67, 74, 83], some of which also point out the connection to frequent pattern mining algorithms. The first survey to discuss the young field was presented by Parsons et al. [67], drawing the research community's attention to the problem and sketching a few early algorithms. In the following years, the problem was studied in much more detail, and categories of similar approaches have been defined [50]. A short discussion of the fundamental problems and strategies has been provided by Kröger and Zimek [53]. Assent gives an overview in the context of high-dimensional data of different provenance, including time series and text documents [9]. Sim et al. [74] discuss 'enhanced' subspace clustering, i.e., they point out particular open problems in the field and discuss methods specifically addressing those problems. Kriegel et al. [52] give a concise overview and point to open questions as well. Based on this overview, an updated discussion was given by Zimek [83]. Recent textbooks by Han et al. [38] and Gan et al. [31] sketch prominent issues and example algorithms. Recent experimental evaluation studies compared some subspace clustering algorithms [60, 63].
The close relationship between the two areas subspace clustering and frequent
pattern mining has been elaborated in a broader perspective by Zimek and Vreeken