record case, we would require space for 10,000 entries per transaction.
If we have 1,000 transactions in our dataset, this is a total of 10,000,000
entries. However, if we use multirecord case format, we store data only
on 20,000 items (20 items/transaction × 1,000 transactions). Each item
requires 3 entries (two minimally), which amounts to 60,000 entries.
Clearly, 10,000,000 entries greatly exceeds the 60,000 entries of the
sparse representation.
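The arithmetic above can be checked with a minimal sketch. The class and method names here are hypothetical, chosen only for illustration, and the figure of 3 entries per item is taken directly from the text without assuming what those entries hold.

```java
class CaseFormatStorage {
    // Single-record case: every transaction reserves one entry
    // for each possible item, whether or not it was purchased.
    static long singleRecordEntries(long possibleItems, long transactions) {
        return possibleItems * transactions;
    }

    // Multirecord case: only the items actually present are stored,
    // each taking a fixed number of entries.
    static long multiRecordEntries(long itemsPerTransaction,
                                   long transactions,
                                   long entriesPerItem) {
        return itemsPerTransaction * transactions * entriesPerItem;
    }

    public static void main(String[] args) {
        long dense = singleRecordEntries(10_000, 1_000);
        long sparse = multiRecordEntries(20, 1_000, 3);
        System.out.println("single-record entries: " + dense);
        System.out.println("multirecord entries:   " + sparse);
        System.out.println("ratio: " + (dense / (double) sparse));
    }
}
```

With the numbers from the text, the dense layout needs roughly 167 times as many entries as the sparse one.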
Association rules are interesting for showing relationships among
items, but can also be interesting for showing relationships among
item categories. To generate association rules that include
categories, some association algorithms can take a taxonomy as input,
which shows the relationships among items. Each item is associated
with one or more categories. Categories in turn can belong to one or
more other categories. The overall taxonomy cannot contain any
cycles; that is, a category cannot end up being its own parent, directly
or indirectly.
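One way to enforce the no-cycles requirement is a depth-first search over the child-to-parent links. The sketch below is an illustration only, not any particular algorithm's or API's validation routine. The class `TaxonomyCycleCheck` and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class TaxonomyCycleCheck {
    // Maps each item or category to its direct parent categories.
    private final Map<String, Set<String>> parents = new HashMap<>();

    // Records that 'child' belongs to 'parent'; a child may have several parents.
    public void addParent(String child, String parent) {
        parents.computeIfAbsent(child, k -> new LinkedHashSet<>()).add(parent);
    }

    // Returns true if some node can reach itself by following parent links.
    public boolean hasCycle() {
        Set<String> done = new HashSet<>();    // nodes fully explored
        Set<String> onPath = new HashSet<>();  // nodes on the current DFS path
        for (String node : parents.keySet()) {
            if (visit(node, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean visit(String node, Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;   // reached a node already on the path: cycle
        }
        if (done.contains(node)) {
            return false;  // already known to be cycle-free
        }
        onPath.add(node);
        for (String p : parents.getOrDefault(node, Collections.emptySet())) {
            if (visit(p, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }
}
```

A taxonomy in which a node has several parents is still valid under this check; only a chain of parent links that leads back to its starting node is rejected.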
Consider the example in Figure 4-7, which illustrates four
subcategories of food: fruits, meats, grains, and dairy. Fruits are further
subcategorized into native and imported fruit, and fresh and canned fruit.
Pineapple exists in both fresh and imported forms. Apples are both
fresh and native fruits. Each of the categories provided may further
be subdivided into finer categories or linked to specific items, for
example, Royal Gala apples.
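The relationships described above can be sketched as a child-to-parents map, from which every category an item belongs to, directly or indirectly, can be collected. This is a minimal illustration with hypothetical class and method names. It encodes only the links stated in the text and omits items whose placement the text does not specify.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FoodTaxonomy {
    // Builds a child -> direct parents map from the links stated in the text.
    static Map<String, List<String>> build() {
        Map<String, List<String>> parents = new HashMap<>();
        for (String c : List.of("Fruits", "Meats", "Grains", "Dairy")) {
            parents.put(c, List.of("Food"));
        }
        for (String c : List.of("NativeFruits", "ImportedFruits",
                                "FreshFruits", "CannedFruits")) {
            parents.put(c, List.of("Fruits"));
        }
        // Items may belong to more than one category.
        parents.put("Apple", List.of("FreshFruits", "NativeFruits"));
        parents.put("Pineapple", List.of("FreshFruits", "ImportedFruits"));
        return parents;
    }

    // Collects all direct and indirect categories of a node (sorted by name).
    static Set<String> ancestors(Map<String, List<String>> parents, String node) {
        Set<String> seen = new TreeSet<>();
        Deque<String> stack = new ArrayDeque<>(parents.getOrDefault(node, List.of()));
        while (!stack.isEmpty()) {
            String current = stack.pop();
            if (seen.add(current)) {
                stack.addAll(parents.getOrDefault(current, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> taxonomy = build();
        System.out.println("Apple: " + ancestors(taxonomy, "Apple"));
        System.out.println("Pineapple: " + ancestors(taxonomy, "Pineapple"));
    }
}
```

Expanding each purchased item into its set of ancestor categories in this way is one common route to rules that mention categories as well as items, though a given algorithm may handle the taxonomy differently.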
Whereas association models normally find rules among items,
given such a taxonomy, an association algorithm can also identify
rules among categories. For example, “dairy implies grains” (one
category implies another category), “dairy implies Rice Krispies”
[Figure 4-7: Taxonomy for food items. Node labels: Food; Fruits, Meats, Grains, Dairy; NativeFruits, ImportedFruits, FreshFruits, CannedFruits; Pork, Beef, Chicken; FruitCocktail, Apple, Pineapple.]