Mining and Using Sets of Patterns through Compression - Frequent Pattern Mining - page 178

Database Reference

In-Depth Information

4.1.1

Single-pass Filtering

The simplest filtering approach uses the following greedy search strategy:

1. Start with an 'empty' model M .

2. Start with an 'empty' model M .

3. Add patterns F

to M one by one. If the addition leads to better compression,

keep it, otherwise, permanently discard F .

∈ F

Although the basic principle of this approach is very simple, note that there are

important details that need to be worked out depending on the specific model and

encoding. For example, it is often impossible to start with a model that is truly empty:

if a model does not contain any patterns at all, it may be impossible to encode the

data at hand and hence there is no compressed size to start from. Also, adding a

pattern to a model is not always straightforward: how and where in the model should

it be added? Depending on design choices, there may be many possibilities and if

these need all to be tested this can become a computational burden. Finally, in what

order should we consider the candidates in

? Since single-pass filtering considers

every candidate only once, this choice will have a large impact on the final result.

F

Example 8.12 KRIMP employs single-pass filtering with several heuristic choices

to ensure that it can induce good code tables from relatively large datasets and

candidate sets in reasonable time.

To ensure any transaction can be encoded, the induction process departs from the

code table containing all singleton itemsets, i.e.,

. Candidate itemsets are

considered in a fixed order, on frequencies and lengths, maximizing the probability

that we encounter candidates that aid compression. Finally, with the same goal,

we imposed an order on the itemsets in the code table. Together with the cover

function, which does not allow overlap, this means that each candidate itemset can

be efficiently evaluated. To further illustrate this example, the KRIMP algorithm is

given as Algorithm 5 .

{{ i }| i ∈ I }

Next Page

Frequent Pattern Mining

Search WWH ::

Custom Search

Home