Database Reference
In-Depth Information
In other words, negative association rules are rules that comprise relationships
between present and absent items. Indeed, items that are not purchased when others
are can be revealing and certainly important in understanding purchasing behaviour.
The association “bread implies milk” indicates the purchasing behaviour of buying
milk and bread together. What about the following associations: “customers who buy
Coke do not buy Pepsi” or “customers who buy juice do not buy bottled water”?
Associations that include negative items (i. e. items absent from the transaction)
can be as valuable as positive associations in many applications, such as devising
marketing strategies. Aggarwal and Yu [ 1 ] discuss some of the weaknesses and the
computational issues for mining positive association rules. They observe that current
methods are especially unsuitable for dealing with dense datasets, which is exactly
the case when one wants to mine negative association rules.
The expensive computation part of association rule mining is the phase enumer-
ating the frequent itemsets (i. e. a set of items). This enumeration takes place in a
search space of size 2 k with k being the number of unique items in the data collection.
Focusing on only positive associations significantly reduces this prohibitive search
space since we only need to count the observed items in the transactions. More-
over, putting the attention on items present in transactions limits the enumeration
of relevant itemsets to a depth dictated by the largest available transaction. These
advantageous stratagems cannot be used if absent items are also considered.
Although interesting and potentially useful, the discovery of negative association
rules is both a complex and computationally expensive problem. We consider a
negative association rule either a negative association between two positive itemsets
or an association rule that contains at least a negative item in the antecedent or
consequent. The mining of negative association rules is a complex problem due to the
increase in items when negative items are considered in the mining process. Imagine
a transaction in market basket analysis where a customer buys bread and milk .
When mining for positive association rules only those two items are considered (i.e.
bread and milk ). However, when negative items are considered (i.e. items/products
not present in a basket/transaction) the search space increases exponentially because
all the items in the collection, although not present in the transaction have to be
considered. Not only is the problem complex, but also large numbers of negative
patterns are uninteresting. The research of mining negative association patterns has
to take into consideration both the complexity of the problem and the usefulness of
the discovered patterns.
2
Negative Patterns and Negative Association Rules
Formally, association rules are defined as follows: Let
be a set of
items. The total number of unique items is m , the dimensionality of the problem.
Let
I ={
i 1 , i 2 , ...i m }
D
be a set of transactions, where each transaction T is a set of items such that
T
. Each transaction is associated with a unique identifier TID . A transaction
T is said to contain X , a set of items in
I
I
,if X T . X is called an itemset.
Search WWH ::




Custom Search