Chapter 5
Interesting Patterns
Jilles Vreeken and Nikolaj Tatti
Abstract Pattern mining is one of the most important aspects of data mining. By
far the most popular and well-known approach is frequent pattern mining, that
is, discovering patterns that occur in many transactions. This approach has many
virtues, including monotonicity (every sub-pattern of a frequent pattern is itself
frequent), which allows efficient discovery of all frequent patterns. Nevertheless,
in practice frequent pattern mining rarely gives good results: the number of
discovered patterns is typically gargantuan and they are heavily redundant.
Consequently, a lot of research effort has been invested toward improving the
quality of the discovered patterns. In this chapter we will give an overview of the
interestingness measures and other redundancy reduction techniques that have been
proposed to this end.
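
As a rough illustration of how monotonicity enables level-wise discovery, the following minimal sketch mines all frequent itemsets from a toy dataset. The function name `frequent_itemsets`, the parameter `min_support`, and the example transactions are assumptions made for this illustration; the sketch follows the general Apriori-style idea and is not a method taken from the chapter.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search for all itemsets occurring in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    items = sorted({i for t in transactions for i in t})
    level = {}
    for i in items:
        s = support(frozenset([i]))
        if s >= min_support:
            level[frozenset([i])] = s

    result = dict(level)
    k = 2
    while level:
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        next_level = {}
        for c in candidates:
            # Monotonicity: a candidate can only be frequent if all of its
            # (k-1)-subsets are frequent, so other candidates are never counted.
            if all(frozenset(sub) in level for sub in combinations(c, k - 1)):
                s = support(c)
                if s >= min_support:
                    next_level[c] = s
        result.update(next_level)
        level = next_level
        k += 1
    return result


# Hypothetical toy data: five transactions over the items a, b, c.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = frequent_itemsets(toy, min_support=3)
```

Even on this tiny dataset, every single item and every pair is reported as frequent, which hints at how quickly the output can grow and overlap on real data.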
In particular, we first present classic techniques such as closed and non-derivable
itemsets that are used to prune unnecessary itemsets. We then discuss techniques for
ranking patterns on how expected their score is under a null hypothesis, considering
patterns that deviate from this expectation to be interesting. These models can be
either static or dynamic; in the dynamic case, we iteratively update the model as we
discover new patterns. More generally, we also give a brief overview of pattern set
mining techniques, where we measure quality over a set of patterns instead of
individually. This setup gives us the freedom to explicitly punish redundancy, which
leads to more to-the-point results.
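
To make two of these ideas concrete, here is a minimal sketch, assuming itemset supports are already available as a dictionary. It filters a collection down to its closed itemsets and scores an itemset by lift, that is, its observed frequency divided by the frequency expected if its items were independent. The independence model is only one simple choice of static null hypothesis, and the names `closed_itemsets`, `lift`, and the toy supports are hypothetical.

```python
def closed_itemsets(supports):
    """Keep itemsets that have no proper superset with the same support.

    Assumes `supports` maps every frequent itemset (a frozenset) to its count.
    """
    return {
        x: s
        for x, s in supports.items()
        if not any(x < y and s == sy for y, sy in supports.items())
    }


def lift(itemset, supports, n):
    """Observed frequency divided by the frequency expected under independence."""
    expected = 1.0
    for item in itemset:
        expected *= supports[frozenset([item])] / n
    return (supports[itemset] / n) / expected


# Hypothetical supports for the 4 transactions {a,b}, {a,b}, {a,b,c}, {c}.
n = 4
supports = {
    frozenset("a"): 3,
    frozenset("b"): 3,
    frozenset("c"): 2,
    frozenset("ab"): 3,
    frozenset("ac"): 1,
    frozenset("bc"): 1,
    frozenset("abc"): 1,
}

closed = closed_itemsets(supports)
lift_ab = lift(frozenset("ab"), supports, n)  # above 1: more frequent than expected
lift_ac = lift(frozenset("ac"), supports, n)  # below 1: less frequent than expected
```

Here the seven itemsets shrink to three closed ones, and the lift scores separate the pair that co-occurs more often than independence predicts from the pair that co-occurs less often.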
Keywords Pattern mining · Interestingness measures · Statistics · Ranking · Pattern set mining