Database Reference
In-Depth Information
The second step of the algorithm computes the candidate set C_1 with all
the 1-itemsets in db that are not in L_1. We only have I_4 = 4 in this situation.
Since I_4 is in both transactions in db, we have I_4.s_db = 2 > 0.5 × 2, and thus,
I_4 will be added to L_1.
Finally, the updated support count is given in the following table. Items
with support less than minsup × 6 = 3 (shown in light gray in the original
table) are marked here with an asterisk:

Item    Count
1       4
2       3
3       2 *
4       3
5       1 *
6       1 *
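The threshold test behind the table can be reproduced in a few lines. The following is a minimal sketch, assuming minsup = 0.5 (the value used in the 0.5 × 2 test on db) and six transactions in the updated database. The names counts, minsup, n_transactions, below_threshold, and at_or_above are illustrative and do not come from the original text.

```python
# Support counts copied from the table above (item -> count).
counts = {1: 4, 2: 3, 3: 2, 4: 3, 5: 1, 6: 1}

minsup = 0.5        # assumption: taken from the 0.5 x 2 test on db
n_transactions = 6  # assumption: size of the updated database

threshold = minsup * n_transactions

# Items whose count is strictly less than minsup x 6.
below_threshold = sorted(i for i, c in counts.items() if c < threshold)
# Items whose count reaches the threshold.
at_or_above = sorted(i for i, c in counts.items() if c >= threshold)

print("below threshold:", below_threshold)
print("at or above threshold:", at_or_above)
```

Under these assumptions, items 3, 5, and 6 fall below the threshold, while items 1, 2, and 4 reach it.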
The association analysis studied so far operates over the items in a
database of transactions. However, we have seen that dimension hierarchies
are a way of defining a hierarchy of concepts along which transaction items
can be classified. This leads to the notion of hierarchical association
rules. For example, in the Northwind data warehouse, products are organized
into categories. Assume now that in the original transaction database in our
example above, items 1 and 2 belong to category A, items 3 and 4 to category
B, and items 5 and 6 to category C. The transaction table with the categories
instead of the items is given below:
TransactionId    Items
1000             {A, A, B}
2000             {A, B}
3000             {A, B}
4000             {A, C, C}
If we now require minsup = 75% over the items database, we obtain no rules
as a result. However, aggregating items over categories, as in the table
above, would result in the rules A ⇒ B and B ⇒ A, since categories A and B
have support at least the minimum, namely, 1 and 0.75, respectively. That
means we could not say that each time a given item X appears in the database,
an item Y will appear, but we could say that each time an item of category A
appears, an item of category B will be present too. This is called a
hierarchical association rule. Note that combinations of items at different
granularities can also appear, for example, rules like “Each time a given
item X appears in the database, an item of a category C will also appear.”
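The category-level supports quoted above can be checked with a short script. This is only a sketch under stated assumptions: the transactions are copied from the category table, repeated categories within a transaction are collapsed into a set, and a rule is reported in both directions whenever the pair of categories reaches minsup (confidence is not considered). The names transactions, support, single, and rules are illustrative.

```python
from itertools import combinations

# Category-level transactions from the table above. Duplicates such as
# {A, A, B} collapse to {A, B} because only presence is counted here.
transactions = {
    1000: {"A", "B"},
    2000: {"A", "B"},
    3000: {"A", "B"},
    4000: {"A", "C"},
}
minsup = 0.75

def support(itemset):
    """Fraction of transactions that contain every category in itemset."""
    hits = sum(1 for t in transactions.values() if set(itemset) <= t)
    return hits / len(transactions)

categories = sorted(set().union(*transactions.values()))

# Support of each single category.
single = {c: support({c}) for c in categories}

# Assumption: a frequent pair yields one rule per direction,
# based on support only (no confidence threshold is applied).
rules = []
for x, y in combinations(categories, 2):
    if support({x, y}) >= minsup:
        rules += [(x, y), (y, x)]

print(single)
print(rules)
```

With these inputs, categories A and B have supports 1 and 0.75, category C falls below the threshold, and the only pair that qualifies is {A, B}.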
 