Chapter 2
Background
In this chapter we provide the background and terminology that are necessary for the
understanding of association rule hiding. Specifically, in Section 2.1, we present the
theory behind association rule mining and introduce the notion of the positive and
the negative borders of the frequent itemsets. Following that, Section 2.2 explicitly
states the goals of association rule hiding methodologies, discusses the different
types of solutions that association rule hiding algorithms can produce, and
delivers the formal problem statement for association rule hiding and its popular
variant, frequent itemset hiding.
2.1 Terminology and Preliminaries
Association rule mining is the process of discovering sets of items (also known
as itemsets) that frequently co-occur in a transactional database so as to produce
significant association rules that hold for the data. Each association rule is defined
as an implication of the form I ⇒ J, where I, J are frequently occurring itemsets
in the transactional database, for which I ∩ J = ∅ (i.e., I and J are disjoint). The
itemset I ∪ J that leads to the generation of an association rule is called the generating
itemset. An association rule consists of two parts: the Left Hand Side (LHS) or
antecedent, which is the part on the left of the arrow of the rule (here I), and the
Right Hand Side (RHS) or consequent, which is the part on the right of the arrow of
the rule (here J). Two metrics, known as support and confidence, are incorporated
into the task of association rule mining to drive the generation of association rules
and expose only those rules that are expected to be of interest to the data owner. In
particular, the measure of support eliminates rules that are not adequately backed
up by the transactions of the dataset and thus are expected to be uninteresting, i.e.,
occurring simply by chance. On the other hand, confidence measures the strength
of the relation between the itemsets of the rule as it quantifies the reliability of the
inference made by the rule [68]. A low value of confidence in a rule I ⇒ J shows
that it is rather rare for itemset J to be present in transactions that contain itemset I.
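To make the two metrics concrete, the following minimal Python sketch computes them on a toy database. It assumes the standard definitions, which this passage has not yet stated formally: the support of an itemset is the fraction of transactions that contain it, and the confidence of I ⇒ J is the support of I ∪ J divided by the support of I. The function names and the sample transactions are hypothetical and serve only as illustration.

```python
from typing import FrozenSet, Iterable, List


def count_containing(itemset: Iterable[str],
                     transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def support(itemset: Iterable[str],
            transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions containing `itemset` (assumed definition)."""
    if not transactions:
        return 0.0
    return count_containing(itemset, transactions) / len(transactions)


def confidence(lhs: Iterable[str], rhs: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Confidence of the rule lhs => rhs: supp(lhs U rhs) / supp(lhs)."""
    lhs_set, rhs_set = frozenset(lhs), frozenset(rhs)
    if lhs_set & rhs_set:
        raise ValueError("antecedent and consequent must be disjoint")
    lhs_count = count_containing(lhs_set, transactions)
    if lhs_count == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return count_containing(lhs_set | rhs_set, transactions) / lhs_count


# Hypothetical toy database of five transactions.
TRANSACTIONS = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread"},
)]

if __name__ == "__main__":
    print(support({"bread", "milk"}, TRANSACTIONS))       # support of the generating itemset
    print(confidence({"bread"}, {"milk"}, TRANSACTIONS))  # rule bread => milk
    print(confidence({"milk"}, {"bread"}, TRANSACTIONS))  # rule milk => bread
```

In this toy database both rules share the generating itemset {bread, milk} and therefore the same support, yet their confidences differ because the antecedents occur with different frequencies. This is why the two metrics are applied together.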