Information Technology Reference
In-Depth Information
the traditional association rule mining framework in order to cope with this
situation. In addition we present a rule caching schema that allows reducing
the number of mining runs. This schema helps to gain interactivity even in the
presence of extreme run times of the mining algorithms.
2 Association Rules
As mentioned, association rules are a popular mining method for dependency
analysis. In this section we formally define association rules and give an illus-
trative example. In addition we introduce further rule quality measures that
supplement the basic support-confidence framework.
2.1 Formal Definition and Example
Let I = {x 1 ,... ,x n } be a set of distinct literals, called items. A set X ⊆I
with k = |X| is called a k -itemset or simply an itemset. Let a database D be a
multi-set of subsets of I . Each T ∈D is called a transaction.
We say that a transaction T ∈D supports an itemset X ⊆I if X ⊆ T holds.
Let X, Y ⊆I be nonempty itemsets with X ∩ Y = . Then an association rule
is an implication
X → Y,
with rule body X , rule head Y , and rule confidence
conf( X → Y )= |{T ∈D|X ∪ Y ⊆ T}|
|{T ∈D|X ⊆ T}|
.
The confidence can be understood as the conditional probability P ( Y |X ). The
fraction of transactions T supporting an itemset X with respect to database D
is called the support of X ,
supp( X )= |{T ∈D|X ⊆ T}|
|D|
.
The support of a rule X → Y is defined as
supp( X → Y )=supp( X ∪ Y ) .
In the following we want to give a simple numeric example for further il-
lustration. In Table 1 a database D , consisting of eight different transactions,
respectively vehicles together with installed special equipments, is shown.
There are five different items, namely: AirConditioning, 2ndAirbag, Battery-
TypeC, Clutch, and RadioTypeE. The support of item AirConditioning is
supp(AirConditioning)= |{v 1 ,v 4 ,v 5 }|
|D|
= 3
8 =37 , 5%
 
Search WWH ::




Custom Search