Information Technology Reference
In-Depth Information
the traditional association rule mining framework in order to cope with this
situation. In addition we present a rule caching schema that allows reducing
the number of mining runs. This schema helps to gain interactivity even in the
presence of extreme run times of the mining algorithms.
2 Association Rules
As mentioned, association rules are a popular mining method for dependency
analysis. In this section we formally define association rules and give an illus-
trative example. In addition we introduce further rule quality measures that
supplement the basic support-confidence framework.
2.1 Formal Definition and Example
Let
I
=
{x
1
,... ,x
n
}
be a set of distinct literals, called items. A set
X ⊆I
with
k
=
|X|
is called a
k
-itemset or simply an itemset. Let a database
D
be a
multi-set of subsets of
I
. Each
T ∈D
is called a transaction.
We say that a transaction
T ∈D
supports an itemset
X ⊆I
if
X ⊆ T
holds.
Let
X, Y ⊆I
be nonempty itemsets with
X ∩ Y
=
∅
. Then an association rule
is an implication
X → Y,
with rule body
X
, rule head
Y
, and rule confidence
conf(
X → Y
)=
|{T ∈D|X ∪ Y ⊆ T}|
|{T ∈D|X ⊆ T}|
.
The confidence can be understood as the conditional probability
P
(
Y |X
). The
fraction of transactions
T
supporting an itemset
X
with respect to database
D
is called the support of
X
,
supp(
X
)=
|{T ∈D|X ⊆ T}|
|D|
.
The support of a rule
X → Y
is defined as
supp(
X → Y
)=supp(
X ∪ Y
)
.
In the following we want to give a simple numeric example for further il-
lustration. In Table 1 a database
D
, consisting of eight different transactions,
respectively vehicles together with installed special equipments, is shown.
There are five different items, namely: AirConditioning, 2ndAirbag, Battery-
TypeC, Clutch, and RadioTypeE. The support of item AirConditioning is
supp(AirConditioning)=
|{v
1
,v
4
,v
5
}|
|D|
=
3
8
=37
,
5%