Note, however, that some of the static models already do this to some extent. For
example, the partition model studies subsets of the itemset under investigation. Such
models do not, however, necessarily take all higher-ranked itemsets into account.
In general, we can divide the dynamic approach into two categories. The first
category is iterative pattern mining, where the goal is to greedily build a sequence
of patterns, each pattern being the most surprising given the previous patterns and
background knowledge. This is the class we discuss in this chapter. The second
category is pattern set mining. There the goal is to produce a set of patterns that
together optimize a given score over the whole set, as opposed to scoring patterns
only individually. We will discuss pattern set mining in the next chapter.
Both categories have many technical similarities and share algorithmic approaches.
In fact, some methods are difficult to pigeonhole, as they can perform both tasks.
The main difference we identify is that in pattern set mining we are looking for a
set of patterns, that is, we need to control the number of patterns, whereas in
iterative pattern ranking we are 'simply' ranking patterns.
5.1 The General Idea
The main ingredient needed to perform iterative pattern mining, as opposed to static
ranking, is a model that we can update. In particular, we need a model that can
incorporate background knowledge in the same shape as what we're mining: patterns.
As such, the general approach here is that in the first iteration we infer the model
p_1 according to the basic background knowledge B_1 we may have about the data. We
then rank all patterns accordingly, and select the top-k best scoring/most interesting
patterns, 𝒳_1 = {X_1, ..., X_k}. We assume the analyst will investigate these in detail,
and hence that from now onward we may regard these patterns, and what can be derived
from them, as 'known'. As such, we update our background knowledge with 𝒳_1, and
hence for iteration 2 have B_2 = B_1 ∪ 𝒳_1, for which we infer model p_2. We then rank
all patterns accordingly, and so on, until we're done.
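As an illustration, this loop can be sketched in a few lines of Python. The helpers infer_model (fit a model p_i to the background knowledge B_i) and score (how surprising a pattern is under that model) are placeholders of our own, not a particular method from the literature; patterns are assumed to be hashable, e.g. itemsets represented as frozensets.

# Sketch of iterative pattern ranking with hypothetical helpers.
# infer_model(background) : fits a probabilistic model p_i to the background knowledge B_i
# score(pattern, model)   : how surprising the pattern is under that model (higher = more)

def iterative_ranking(candidates, background, infer_model, score, k=10, iterations=5):
    """Greedily build a sequence of pattern batches.

    Each iteration re-fits the model to everything reported so far, so a
    pattern only scores high if it is surprising given B_i.
    """
    background = set(background)   # B_1: initial background knowledge
    report = []                    # X_1, X_2, ... : the batches shown to the analyst

    for _ in range(iterations):
        model = infer_model(background)                       # infer p_i from B_i
        ranked = sorted(candidates, key=lambda X: score(X, model), reverse=True)
        top_k = ranked[:k]                                    # X_i: top-k most surprising
        if not top_k:
            break
        report.append(top_k)
        background |= set(top_k)                              # B_{i+1} = B_i ∪ X_i
        candidates = [X for X in candidates if X not in set(top_k)]

    return report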
Next we will discuss three methods that allow for dynamic ranking.
5.2 Maximum Entropy Models
Maximum Entropy models, which we've met in the previous section, provide a
natural way of constructing a probabilistic model from a given set of itemsets and
their frequencies: essentially, each itemset is a constraint. The technical details of
how to infer a model under such constraints are beyond the scope of this chapter;
we refer the interested reader to, for example, Pavlov et al. [53].
Given a model that can incorporate itemsets and frequencies as background knowledge,
we need to define a score for ranking candidates. We can use the statistical
tests from Sect. 4, but a more intuitive approach is to use the likelihood of the data
under the model.
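To make this concrete, here is a small toy sketch of a maximum entropy distribution over transactions in which each itemset-frequency pair acts as a constraint, fitted by coordinate-wise iterative scaling. It brute-forces all 2^n transactions, so it only illustrates the idea and is not the efficient inference discussed by Pavlov et al. [53]; the names (maxent_model, constraints) are ours.

# Toy maximum entropy model over transactions with itemset-frequency constraints.
# Only feasible for a handful of items, since it enumerates all 2^n transactions.

from itertools import chain, combinations

def maxent_model(n_items, constraints, rounds=200):
    """constraints: list of (itemset, target_frequency) pairs, itemset as a frozenset of ints.

    Returns a dict mapping every transaction (frozenset) to its probability under
    the maximum entropy distribution that (approximately) satisfies the constraints.
    """
    items = range(n_items)
    transactions = [frozenset(t) for t in chain.from_iterable(
        combinations(items, r) for r in range(n_items + 1))]
    # exponential-family form: p(t) is proportional to the product of u[S] over constraints S ⊆ t
    u = {S: 1.0 for S, _ in constraints}

    def probs():
        w = {t: 1.0 for t in transactions}
        for t in transactions:
            for S, _ in constraints:
                if S <= t:
                    w[t] *= u[S]
        z = sum(w.values())
        return {t: w[t] / z for t in transactions}

    for _ in range(rounds):
        for S, f in constraints:
            p = probs()
            a = sum(p[t] for t in transactions if S <= t)    # current frequency of S
            if 0.0 < a < 1.0 and 0.0 < f < 1.0:
                u[S] *= (f * (1.0 - a)) / ((1.0 - f) * a)    # exact coordinate-wise update
    return probs()

# e.g. model = maxent_model(4, [(frozenset({0, 1}), 0.4), (frozenset({2}), 0.7)])

Given such a model, and assuming transactions are drawn independently, the likelihood of an observed database is simply the product of the modelled probabilities of its transactions, which quantifies how well the current background knowledge explains the data.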