Databases Reference
In-Depth Information
Apriori-TFP 1 based (i.e., [11-13,16,17], etc.), and (2) FP-tree based (i.e.,
[9, 24, 27, 39], etc.).
Mining ARs from MFIs
It is apparent that the size of a complete set of FIs can be very large. The
concept of MFI [47] was proposed to find several “long” (super) FIs in D T ,
which avoids the redundant work required to identify “short” FI. The con-
cept of vertical mining has also been effectively promoted in this category [54].
Vertical mining, first mentioned in [32], deals with a vertical transaction data-
base D T , where each data-record represents an item that is associated with a
list of its relative transactions (the transactions in which it is present). Typ-
ical MFI algorithms include: MaxEclat/Eclat [54], MaxClique/Clique [54],
Max-Miner [47], Pincer-Search [37], MAFIA [8], GenMax [26], etc.
Mining ARs from FCIs
Algorithms belonging to this category extract ARs through generating a set
of FCIs from D T . In fact the support of some sub-itemsets of an MFI might
be hard to identified resulting in a further di culty in the computation of
confidence. The concept of FCI [44] is proposed to improve this property
of MFI, which avoids the di culty of identifying the support of any sub-
itemsets of a relatively “long” FI. A FCI f is an itemset S
D T ,where f
f and f shares a common support with
f . The relationship between FI, MFI and FCI is that MFI
itemset f
is frequent and
¬∃
FI [8].
In this category typical algorithms include: CLOSET [44], CLOSET+ [51],
CHARM [55], MAFIA [8], etc.
FCI
2.2 Classification Rule Mining
CRM deals with D C ,where D C is founded as D R
D E . It discovers a set of
CRs in D R from which to build a classifier to classify “unseen” data records in
D E .A D R consists of n data-attributes and m data records. By convention the
last data-attribute in each data-record usually indicates its pre-defined class,
noted as the class attribute. CRM can thus be described as the process of
assigning a Boolean value to each pair ( d j ,c i )
D E
is an “unseen” data-record, C as declared in Sect. 1 is a set of pre-defined
classes, and ( d j ,c i ) is a data-record in D E to be labeled.
D E ×
C ,whereeach d j
The Cover Algorithm
In CRM a number of approaches have been proposed to generate a classifier
from a set of training data-records. For example the Cover Algorithm [41]
takes D R as its input and aims to generate a complete set of minimal non-
redundant CRs. We define the Cover algorithm in Fig. 1 as follows.
1 Apriori-TFP and its related softwares may be obtained from http://www.csc.liv.
ac.uk/frans/KDD/Software.
Search WWH ::




Custom Search