Data Analytics: Exploiting the Data Warehouse - Data Warehouse Systems: Design and Implementation - page 337


The second step of the algorithm computes the candidate set $C_1$ with all the 1-itemsets in $db$ that are not in $L_1$. We only have $I_4 = 4$ in this situation. Since $I_4$ is in both transactions in $db$, we have $I_4.s_{db} = 2 > 0.5 \times 2$, and thus $I_4$ will be added to $L_1$.

Finally, the updated support count is given in the following table, where an asterisk marks the items I with support less than minsup × 6:

Item   Count
1      4
2      3
3*     2
4      3
5*     1
6*     1
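The two threshold checks above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic, not the update algorithm itself: the names `MINSUP`, `min_count`, and `updated_counts` are illustrative assumptions, the counts are copied from the table, and an item is treated as frequent when its count is not less than minsup times the number of transactions.

```python
# Illustrative sketch; all names here are assumptions, not from the book.
MINSUP = 0.5  # minimum support used in the running example


def min_count(n_transactions, minsup=MINSUP):
    """Support count an itemset must reach over n_transactions."""
    return minsup * n_transactions


# Step 2: item 4 occurs in both transactions of the increment db.
item4_count_in_db = 2
item4_passes = item4_count_in_db >= min_count(2)  # 2 >= 1.0

# Updated support counts over all 6 transactions (copied from the table).
updated_counts = {1: 4, 2: 3, 3: 2, 4: 3, 5: 1, 6: 1}
threshold = min_count(6)  # 3.0

infrequent = sorted(i for i, c in updated_counts.items() if c < threshold)
frequent = sorted(i for i, c in updated_counts.items() if c >= threshold)

print(item4_passes)  # True
print(infrequent)    # [3, 5, 6]
print(frequent)      # [1, 2, 4]
```

Items 2 and 4 sit exactly on the threshold of 3, so they are not below it and remain frequent.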

The association analysis studied so far operates over the items in a database of transactions. However, we have seen that dimension hierarchies are a way of defining a hierarchy of concepts along which transaction items can be classified. This leads to the notion of hierarchical association rules. For example, in the Northwind data warehouse, products are organized into categories. Assume now that in the original transaction database in our example above, items 1 and 2 belong to category A, items 3 and 4 to category B, and items 5 and 6 to category C. The transaction table with the categories instead of the items is given below:

TransactionId   Items
1000            {A, A, B}
2000            {A, B}
3000            {A, B}
4000            {A, C, C}
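Replacing items by their categories amounts to a roll-up along the product hierarchy, which might be sketched as follows. The item-to-category mapping is the one stated above. The item-level transactions, however, are an assumption: the original transaction table is not repeated here, so the values below were chosen only because they reproduce the category table. The names `ITEM_CATEGORY` and `roll_up` are likewise illustrative.

```python
# Mapping stated in the text: items 1-2 -> A, 3-4 -> B, 5-6 -> C.
ITEM_CATEGORY = {1: "A", 2: "A", 3: "B", 4: "B", 5: "C", 6: "C"}

# ASSUMPTION: hypothetical item-level transactions, picked only so that
# the roll-up matches the category table; not taken from the source.
item_transactions = {
    1000: [1, 2, 3],
    2000: [1, 3],
    3000: [1, 4],
    4000: [2, 5, 6],
}


def roll_up(transactions, mapping):
    """Replace every item by its category, keeping duplicates."""
    return {
        tid: [mapping[item] for item in items]
        for tid, items in transactions.items()
    }


category_transactions = roll_up(item_transactions, ITEM_CATEGORY)
for tid, categories in category_transactions.items():
    print(tid, categories)
```

Duplicates such as the two A entries in transaction 1000 are kept on purpose, so the output has the same shape as the table.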

Suppose now that we require minsup = 75%. Over the items database, we would obtain no rules. However, aggregating items over categories, as in the table above, would result in the rules A ⇒ B and B ⇒ A, since categories A and B have support of at least the minimum, namely, 1 and 0.75, respectively. That means we could not say that each time a given item X appears in the database, an item Y will appear, but we could say that each time an item of category A appears, an item of category B will be present too. This is called a hierarchical association rule. Note that combinations of items at different granularities can also appear, for example, rules like “Each time a given item X appears in the database, an item of a category C will also appear.”
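The supports quoted above can be checked directly against the category table. The following is a minimal sketch, assuming that support is the fraction of transactions containing every category of an itemset and that an itemset qualifies when its support is at least minsup. The helper `support` and the variable names are illustrative. The confidence column is not reported in the passage and is included only for completeness.

```python
from itertools import permutations

MINSUP = 0.75

# Category-level transactions, copied from the table (1000 to 4000).
category_transactions = [
    ["A", "A", "B"],
    ["A", "B"],
    ["A", "B"],
    ["A", "C", "C"],
]


def support(itemset, transactions):
    """Fraction of transactions that contain every element of itemset."""
    wanted = set(itemset)
    hits = sum(1 for t in transactions if wanted <= set(t))
    return hits / len(transactions)


categories = sorted({c for t in category_transactions for c in t})
frequent = [
    c for c in categories
    if support([c], category_transactions) >= MINSUP
]

# Candidate rules lhs => rhs between frequent categories.
rules = []
for lhs, rhs in permutations(frequent, 2):
    pair_support = support([lhs, rhs], category_transactions)
    if pair_support >= MINSUP:
        confidence = pair_support / support([lhs], category_transactions)
        rules.append((lhs, rhs, pair_support, confidence))

print(frequent)  # ['A', 'B']
for lhs, rhs, s, conf in rules:
    print(f"{lhs} => {rhs}: support={s}, confidence={conf}")
```

Under these assumptions, category C is discarded with support 0.25, and both A ⇒ B and B ⇒ A reach a support of 0.75. The confidence of A ⇒ B comes out at 0.75 rather than 1, because transaction 4000 contains category A but not category B.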
