Database Reference
In-Depth Information
However, assume that we would use the following generality order:
π 1 π 2 if we can obtain π 1 from π 2 by repeatedly removing from π 2 the item with the
highest cost.
Then under this order the constraint is anti-monotonic: after all, by removing the
most costly items from an itemset, the average cost of the items in the itemset can
only go down and it hence also must satisfy the constraint.
Note that this order is compatible with the use of the subset relation as cov-
erage relation; hence, this constraint can be combined with a minimum support
constraint. Such an order can be incompatible with another order needed to make
another constraint anti-monotonic, however.
Constraints which have this property, i.e., that a different generality relation
needs to be used than the coverage relation to obtain (anti-)monotonicity, are called
convertible (anti)-monotonic in the literature [ 26 ].
Succinctness Succinctness was originally defined for itemsets [ 23 ], but we will use
a slightly different definition here which is applicable to other pattern languages as
well: we will call any constraint succinct that can be enforced by manipulating the
data. Consider the following two examples:
￿
we want to find frequent itemsets without item i : if we remove item i from the
database, we will no longer find such itemsets;
￿
we want to find frequent itemsets that include item i : if we remove all transactions
without item i from the database, and then remove item i from the remaining
transactions, we can add item i to every itemset we find in the resulting database
to obtain the desired set of itemsets.
These examples can easily be generalized to require the inclusion or exclusion of an
itemset π
I
.
Condensed representations The set of patterns satisfying the above constraints
may still be large. Condendensed representations consitute an additional approach
for reducing a set of patterns. The main idea is to determine a small set of patterns
that still is sufficiently large to determine a full set of patterns. The property that a
pattern is part of a condensed representation can also be seen as a constraint.
We will discuss two of the most well-known cases here.
Given a generality relation
, a pattern π is called closed if there is no more specific pattern
π with π π such that cover ( π )
= cover ( π ).
Intuitively, closed frequent patterns [ 25 ] allow one to recover a set of frequent
itemsets together with their supports.
A subtle issue is the combination of the closedness constraint with other con-
straints. As an example, consider the maximum size constraint. One can distinguish
two settings:
￿
a setting in which one searches for patterns satisfying the size constraint among
those patterns that are closed;
￿
a setting in which one searches for patterns that are closed, restricting the set of
patterns that are considered in the closedness definition only to those that satisfy
the constraint.
Search WWH ::




Custom Search