As an example, assume that {1} is not closed and that {1, 2} is closed, while we have a maximum size constraint of 1. Then itemset {1} would not be in the output in the first setting, but would be in the output of the second. Constraint-based pattern mining systems can differ in their approach for dealing with this issue.
Another condensed representation is that of maximal patterns. Given a generality relation ⪯ and a constraint ϕ, a pattern π that satisfies constraint ϕ is called maximal with respect to constraint ϕ if there is no more specific pattern π′ with π ≺ π′ such that π′ satisfies the constraint.
Compared to closed itemsets, maximal itemsets [1, 20] no longer allow one to recover the supports of a set of patterns.
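To make the distinction concrete, the following Python sketch (a hypothetical helper, not taken from the chapter) extracts the maximal itemsets from a collection of itemsets that all satisfy some constraint: an itemset is kept only if the collection contains no proper superset of it.

def maximal_itemsets(satisfying):
    """Return the maximal itemsets among a collection of itemsets that
    all satisfy some constraint (illustrative helper, assumed input)."""
    sets = [frozenset(s) for s in satisfying]
    # An itemset is maximal if no proper superset is in the collection.
    return [s for s in sets if not any(s < t for t in sets)]

# Example: frequent itemsets for some minimum support threshold (assumed data)
frequent = [{1}, {2}, {3}, {1, 2}, {1, 3}]
print(maximal_itemsets(frequent))  # [frozenset({1, 2}), frozenset({1, 3})]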
If the constraint ϕ is a minimum support constraint, one typically refers to maximal
frequent patterns. Whereas maximal frequent patterns are the most popular, it can
also be useful to study maximality with respect to other constraints. Essentially, any
anti-monotonic constraint defines a border in the space of patterns: all patterns that satisfy the constraint lie on one side of the border, while all patterns that do not satisfy it lie on the other side [11, 20].
Similarly, a monotonic constraint also defines a border: in this case, the border one is looking for is that of the minimal patterns that satisfy the constraint.
Different borders can be combined. Probably the most well-known example of
this is found in the analysis of supervised data. If the database consists of two classes
of examples, one can ask for all patterns that are frequent in one class but infrequent in the other; the resulting set of patterns has two borders: one formed by the most specific patterns in this set, the other by the most general ones.
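The following sketch illustrates how the two constraints and the two borders interact; the data, thresholds (min_sup, max_sup), and function names are assumptions for illustration only and do not come from the chapter. It first keeps the patterns that are frequent in the positive class and infrequent in the negative class, and then reads off the most general (minimal) and most specific (maximal) patterns of that set.

def support(itemset, transactions):
    """Fraction of transactions containing all items of the itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= frozenset(t) for t in transactions) / len(transactions)

def contrast_patterns(candidates, pos, neg, min_sup, max_sup):
    """Patterns frequent in `pos` but infrequent in `neg` (assumed thresholds)."""
    return [frozenset(c) for c in candidates
            if support(c, pos) >= min_sup and support(c, neg) < max_sup]

def borders(patterns):
    """Most general (minimal) and most specific (maximal) patterns of a set."""
    minimal = [p for p in patterns if not any(q < p for q in patterns)]
    maximal = [p for p in patterns if not any(p < q for q in patterns)]
    return minimal, maximal

# Toy two-class database and candidate patterns (assumed, illustrative only)
pos = [{1, 2}, {1, 2, 3}, {1, 3}]
neg = [{2, 3}, {3}, {2}]
cands = [{1}, {2}, {3}, {1, 2}, {1, 3}, {1, 2, 3}]
result = contrast_patterns(cands, pos, neg, min_sup=0.6, max_sup=0.4)
print(borders(result))  # ([frozenset({1})], [frozenset({1, 2}), frozenset({1, 3})])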
Boundable Constraints  The minimum support constraint is one example of a constraint of the kind f(π) ≥ θ. Over the years, more complex functions have been studied. One example is that of accuracy (see the chapter on supervised patterns), which calculates the accuracy of a pattern when used as a classification rule on supervised data. Many such functions no longer have the (anti-)monotonicity property. In some cases, however, one can identify an alternative function f′ such that: (1) it is feasible to mine all patterns with f′(π) ≥ θ; and (2) f′(π) ≥ f(π) for every pattern π. In this case, all patterns satisfying f(π) ≥ θ can be determined by first mining all patterns with f′(π) ≥ θ and then calculating f(π) for each pattern found. Function f′ can be considered a relaxation of function f [29]. In Chap. 17 it was discussed that for supervised data such bounds often exist.
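A minimal sketch of this filter-and-refine idea follows, assuming a hypothetical exact function f and relaxation f′: here the exact score is the support difference between two classes, and the positive-class support alone serves as its upper bound, since the subtracted term is never negative. The data and threshold are illustrative only.

def mine_with_relaxation(candidates, f, f_relaxed, theta):
    """Generic scheme for a boundable constraint f(pi) >= theta, using a
    relaxation with f_relaxed(pi) >= f(pi) for every pattern (assumed)."""
    # Step 1: keep patterns satisfying the easier-to-mine relaxed bound.
    promising = [p for p in candidates if f_relaxed(p) >= theta]
    # Step 2: evaluate the exact function only on the surviving patterns.
    return [p for p in promising if f(p) >= theta]

# Toy example (assumed scoring functions and data)
pos = [{1, 2}, {1, 2, 3}, {1, 3}]
neg = [{2, 3}, {2}]
def sup(p, db): return sum(set(p) <= set(t) for t in db) / len(db)
f_exact = lambda p: sup(p, pos) - sup(p, neg)   # exact score f
f_bound = lambda p: sup(p, pos)                 # relaxation f', upper bound of f
cands = [{1}, {2}, {1, 2}, {2, 3}]
print(mine_with_relaxation(cands, f_exact, f_bound, theta=0.5))  # [{1}, {1, 2}]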
3 Level-Wise Algorithm
Most constraint-based mining algorithms can be seen as generalized versions
of frequent pattern mining algorithms. Consequently, as in frequent itemset mining, both breadth-first (BFS), or level-wise, and depth-first search algorithms have been proposed. The earliest techniques were typically BFS approaches, on which later work improved; hence, we discuss them first.
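As a point of reference, here is a minimal Apriori-style sketch of a level-wise search under an anti-monotonic constraint; the predicate, candidate generation, and toy data are assumptions for illustration and do not correspond to any specific published algorithm.

from itertools import combinations

def level_wise(items, satisfies):
    """Level-wise (BFS) mining sketch for an anti-monotonic constraint.
    `satisfies(itemset)` is assumed anti-monotonic, e.g. a minimum support test."""
    result = []
    level = [frozenset([i]) for i in items if satisfies(frozenset([i]))]
    while level:
        result.extend(level)
        # Join patterns of the current level that differ in one item to form
        # candidates of the next level, then keep those satisfying the constraint.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates if satisfies(c)]
    return result

# Example: minimum support of 2 on a small transaction database (assumed data)
db = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 3}]
freq = level_wise({1, 2, 3}, lambda s: sum(s <= t for t in db) >= 2)
print(sorted(freq, key=len))  # all singletons and all pairs are frequent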